TagEditor German character corruption in XML file
Thread poster: SylviaMB
Sep 14, 2010

I'm running TagEditor 7.1.0.719 on Windows XP and am translating an XML file from English into German. When I open a match of any kind that contains German characters, those characters show up with a particular corruption: The German character plus the character directly following it are rendered as a question mark.

The TM export shows the characters correctly, and I can use the TM on other files without problems.

When I correct the characters and save the target file as XML, the characters appear correctly. But when the same segment gets pulled into another open segment, the characters are corrupt again.

I should say that I did not receive an .ini file, because I am working with a direct client on this one rather than an agency. I generated my own .ini file based on the file I received. I have worked with HTML (and, I am pretty sure, XML) files before without encountering this problem.

Any ideas?


 

Jerzy Czopik  Identity Verified
Germany
Local time: 22:41
Member (2003)
Polish to German
+ ...
Did you select entity conversion? Sep 14, 2010

Maybe you need to convert entities in your ini - this could do the trick.
Or you could select another font for displaying the TTX.
What you describe looks like a code page or font problem - I have not had that with XML files, but quite often with Word.


 

SylviaMB
TOPIC STARTER
Characters still corrupted Sep 15, 2010

Thanks for that suggestion, Jerzy.

Yes, I remember similar things happening in Word. Changing the display font did not do the trick, and selecting different entity conversions in the Tag Settings didn't, either. Besides, the Help recommends leaving just the Default XML set for XML files. Adding things like Latin 1, etc. did not make a difference.

The encoding is UTF-8. I suspect it's something that needs to be fixed on the programmer's part, but I haven't a clue what to tell him to fix.

Sylvia


 

Antonín Otáhal
Local time: 22:41
Member (2005)
English to Czech
+ ...
unipad Sep 15, 2010

You can check whether it is really UTF-8 using SC UniPad: http://www.unipad.org
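
If you would rather check from a script than in an editor, here is a minimal sketch in Python. It is not part of TagEditor or UniPad; the function names and the command-line usage are just assumptions for illustration. It only tells you whether the bytes form valid UTF-8 and where the first invalid byte sits, not which encoding the file really uses.

```python
import sys
from pathlib import Path


def check_utf8(data: bytes):
    """Return (True, None) if data is valid UTF-8,
    otherwise (False, offset of the first invalid byte)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return False, err.start
    return True, None


def has_utf8_bom(data: bytes) -> bool:
    """Report whether the data starts with the UTF-8 byte order mark."""
    return data.startswith(b"\xef\xbb\xbf")


if __name__ == "__main__":
    # Hypothetical usage: python check_utf8.py path/to/file.xml
    raw = Path(sys.argv[1]).read_bytes()
    ok, offset = check_utf8(raw)
    print("BOM present:", has_utf8_bom(raw))
    if ok:
        print("File decodes cleanly as UTF-8")
    else:
        print("Invalid UTF-8 byte at offset", offset)
```

A file that was saved in a Windows code page but declared as UTF-8 will typically fail at the first umlaut.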

Antonin


 

Jerzy Czopik  Identity Verified
Germany
Local time: 22:41
Member (2003)
Polish to German
+ ...
When changing the settings in the INI file, did you reload the XML? Sep 15, 2010

Changing the settings in the ini does not affect the ttx you created previously.
Did you recreate the ttx from xml each time you changed settings in the ini?
If yes, then sorry - I do not have any further ideas.
If not, please try to convert entities and recreate the ttx.


 

SylviaMB
TOPIC STARTER
TagEditor German character corruption in XML file Sep 15, 2010

Thank you, Antonin and Jerzy, for those suggestions.

None of them fixed the problem, but I appreciate you taking the time.

Sylvia


 

Jerzy Czopik  Identity Verified
Germany
Local time: 22:41
Member (2003)
Polish to German
+ ...
Strange problem Sep 15, 2010

Would you send me your source file, the ini and the ttx for testing?
My mail is info at czopik dot com.


 

SylviaMB
TOPIC STARTER
Problem identified Sep 15, 2010

Jerzy,

thanks for offering to do that. While preparing a sample file to send you, I hit upon some new lines of investigation that eventually solved the problem.

In case someone has a similar problem in the future:

It had nothing to do with the .ini file or code pages. Instead, a previous, similar file I translated must have used a strange font, and while everything showed up correctly in the translation that time, the segments were all saved to the TM with weird characters. The reason the existing translation units in the TM were not overwritten with my corrected segments was that somehow, probably during an import, the TM settings had been changed to "merge" instead of "overwrite," so TagEditor kept pulling the corrupted segments.

I am working on fixing the TM.
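
For anyone doing a similar cleanup, a rough sketch of how the damaged segments could be located in a text export of the TM. This is only a heuristic under stated assumptions: the export has already been read into lines of text, and the damage shows up as a question mark wedged between letters (as in "Gr?e") or as the Unicode replacement character. The pattern and the function name are made up for illustration, and a question mark that swallowed the last letter of a word will not be caught.

```python
import re

# Assumption: corrupted text appears as U+FFFD, or as "?" sitting
# directly between two word characters. An ordinary sentence-final "?"
# is followed by a space or line end and is therefore not matched.
SUSPECT = re.compile(r"\ufffd|(?<=\w)\?(?=\w)")


def find_suspect_lines(lines):
    """Return (line number, text) pairs for lines that look corrupted."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if SUSPECT.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits
```

The hits still have to be reviewed by hand, since a URL with a query string, for example, would also be flagged.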

Thanks a bunch. Your suggestions put me on the right track.

Sylvia


 

