TagEditor German character corruption in XML file
Thread poster: SylviaMB
Sep 14, 2010

I'm running TagEditor 7.1.0.719 on Windows XP and am translating an XML file from English into German. When I open a match of any kind that contains German characters, those characters show up with a particular corruption: the German character and the character directly following it are together rendered as a question mark.

The TM export shows the characters correctly, and I can use the TM on other files without problems.

When I correct the characters and save the target file as XML, the characters appear correctly. But when the same segment gets pulled into another open segment, the characters are corrupt again.

I should say that I did not receive an .ini file, because I am not working with an agency but with a direct client on this one. I generated my own .ini file based on the file I received. I have worked with HTML (and, I am pretty sure, XML) files before without encountering this problem.

Any ideas?



Jerzy Czopik  Identity Verified
Germany
Local time: 22:25
Member (2003)
Polish to German
+ ...
Did you select entity conversion? Sep 14, 2010

Maybe you need to convert entities in your ini - this could do the trick.
Or you could select another font for displaying the TTX.
What you describe looks like a code page or font problem - I have not had that with XML files, but quite often with Word.


SylviaMB
TOPIC STARTER
Characters still corrupted Sep 15, 2010

Thanks for that suggestion, Jerzy.

Yes, I remember similar things happening in Word. Changing the display font did not do the trick, and selecting different entity conversions in the Tag Settings didn't, either. Besides, the Help recommends leaving just the Default XML set for XML files. Adding things like Latin 1, etc. did not make a difference.

The encoding is UTF-8. I suspect it's something that needs to be fixed on the programmer's part, but I haven't a clue what to tell him to fix.

Sylvia



Antonín Otáhal
Local time: 22:25
Member (2005)
English to Czech
+ ...
unipad Sep 15, 2010

You can check whether it is really UTF-8 using SC UniPad: http://www.unipad.org
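
For anyone who would rather script this check than install an editor, here is a minimal Python sketch. It only tests whether a file's bytes decode as UTF-8 and reports the first offending byte if they do not. The function name `describe_utf8` is made up for illustration, and the check says nothing about how TagEditor itself reads the file.

```python
def describe_utf8(data: bytes):
    """Return (ok, detail) telling whether the bytes are valid UTF-8.

    Hypothetical helper for illustration only.
    """
    # A UTF-8 byte order mark is optional; note it if present.
    has_bom = data.startswith(b"\xef\xbb\xbf")
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the offset of the first byte that cannot be decoded.
        bad = data[err.start]
        return False, f"invalid byte 0x{bad:02x} at offset {err.start}"
    return True, "valid UTF-8 (with BOM)" if has_bom else "valid UTF-8"
```

To run it on a file, read the file in binary mode first, for example `describe_utf8(open("source.xml", "rb").read())`, where `source.xml` is a placeholder name. A file saved in a single-byte code page such as Latin-1 will usually fail at the first umlaut.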

Antonin



Jerzy Czopik  Identity Verified
Germany
Local time: 22:25
Member (2003)
Polish to German
+ ...
When changing the settings in the INI file, did you reload the XML? Sep 15, 2010

Changing the settings in the ini does not affect the ttx you created previously.
Did you recreate the ttx from xml each time you changed settings in the ini?
If yes, then sorry - I do not have any further ideas.
If not, please try to convert entities and recreate the ttx.


SylviaMB
TOPIC STARTER
TagEditor German character corruption in XML file Sep 15, 2010

Thank you, Antonin and Jerzy, for those suggestions.

None of them fixed the problem, but I appreciate you taking the time.

Sylvia



Jerzy Czopik  Identity Verified
Germany
Local time: 22:25
Member (2003)
Polish to German
+ ...
Strange problem Sep 15, 2010

Would you send me your source file, the ini and the ttx for testing?
My mail is info at czopik dot com.


SylviaMB
TOPIC STARTER
Problem identified Sep 15, 2010

Jerzy,

thanks for offering to do that. While preparing a sample file to send you, I hit upon some new lines of investigation that eventually solved the problem.

In case someone has a similar problem in the future:

It had nothing to do with the .ini file or code pages. Instead, a previous, similar file I translated must have used a strange font, and while everything showed up correctly in the translation that time, the segments were all saved with weird characters. The reason that existing translation units in the TM were not overwritten with my corrected segments was that somehow, probably during an import, the TM settings were changed to "merge" instead of "overwrite," so TagEditor kept pulling the corrupted segments.
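
In case it helps someone with a similar cleanup, here is a rough Python sketch for finding affected units. It assumes the TM has been exported to TMX (I am not describing my exact export here) and that the corruption shows up as a question mark stuck inside a word, or as the Unicode replacement character. The name `suspect_segments` and the pattern are my own illustration and only a heuristic: a corrupted character at the very end of a word looks like an ordinary question and will be missed.

```python
import re
import xml.etree.ElementTree as ET

# Heuristic (assumption): U+FFFD anywhere, or a "?" with a letter or digit
# directly on both sides, as in "Gr?e" where "Größe" was expected.
SUSPECT = re.compile(r"\ufffd|(?<=\w)\?(?=\w)")

# TMX 1.4 uses xml:lang on <tuv>; older versions use a plain lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def suspect_segments(root):
    """Yield (language, text) for every <seg> that looks corrupted."""
    for tuv in root.iter("tuv"):
        seg = tuv.find("seg")
        if seg is None:
            continue
        # itertext() also collects text around inline tags inside <seg>.
        text = "".join(seg.itertext())
        if SUSPECT.search(text):
            lang = tuv.get(XML_LANG) or tuv.get("lang") or ""
            yield lang, text
```

For an export on disk, something like `suspect_segments(ET.parse("export.tmx").getroot())` would list the candidates (`export.tmx` is a placeholder name). Each hit still has to be checked by eye before it is corrected in the TM.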

I am working on fixing the TM.

Thanks a bunch. Your suggestions put me on the right track.

Sylvia

