Trados Studio - invalid character - generate target file
Thread poster: Tom Fennell
Tom Fennell
United States
Local time: 20:05
Russian to English
+ ...
Mar 7, 2011

Just thought I'd share how I got out of a pickle after encountering the following error when trying to generate a target translation in Trados Studio 2009:

"hexadecimal value 0x1F, is an invalid character. Line 1, position 37736"

After the usual hours of research and trying, I found out that

1. the hexadecimal values named in these errors (0x1F and the like; "0x" is just the hexadecimal prefix) are non-printing control characters (bell, backspace, etc.), which are illegal in XML.

From Seattle Software we find the following:

Ever get a System.Xml.XmlException that says:

“Hexadecimal value 0x[whatever] is an invalid character”

…when trying to load an XML document using one of the .NET XML API objects like XmlReader, XmlDocument, or XDocument? Was “0x[whatever]” by chance one of these characters?

0x00
0x01
0x02
0x03
0x04
0x05
0x06
0x07
0x08
0x0B
0x0C
0x0E
0x0F
0x10
0x11
0x12
0x13
0x14
0x15
0x16
0x17
0x18
0x19
0x1A
0x1B
0x1C
0x1D
0x1E
0x1F
0x7F

The problem that causes these XmlExceptions is that the data being read or loaded contains characters that are illegal according to the XML specifications. Almost always, these characters are in the ASCII control character range (think whacky characters like null, bell, backspace, etc). These aren’t characters that have any business being in XML data; they’re illegal characters that should be removed, usually having found their way into the data from file format conversions, like when someone tries to create an XML file from Excel data, or export their data to XML from a format that may be stored as binary.


If you are really interested, you can look at the Wikipedia Article on ASCII Control Characters

In my case the offending character seems to have been the Unit Separator (code US, 0x1F).
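For anyone who wants to check a particular value, here is a minimal Python sketch of the rule. It assumes the "Char" production of the XML 1.0 specification as the definition of a legal character; the function names are made up for illustration. Note that by this definition 0x7F is permitted (though discouraged), so the sketch does not flag it.

```python
# ASCII abbreviations for the C0 control characters (0x00-0x1F).
C0_NAMES = [
    "NUL", "SOH", "STX", "ETX", "EOT", "ENQ", "ACK", "BEL",
    "BS", "HT", "LF", "VT", "FF", "CR", "SO", "SI",
    "DLE", "DC1", "DC2", "DC3", "DC4", "NAK", "SYN", "ETB",
    "CAN", "EM", "SUB", "ESC", "FS", "GS", "RS", "US",
]


def is_xml10_char(code_point: int) -> bool:
    """True if the code point matches the XML 1.0 'Char' production."""
    return (
        code_point in (0x09, 0x0A, 0x0D)
        or 0x20 <= code_point <= 0xD7FF
        or 0xE000 <= code_point <= 0xFFFD
        or 0x10000 <= code_point <= 0x10FFFF
    )


def describe(code_point: int) -> str:
    """Summarise one code point: ASCII abbreviation (if any) and XML status."""
    name = C0_NAMES[code_point] if code_point < 0x20 else "n/a"
    status = "legal" if is_xml10_char(code_point) else "illegal"
    return f"0x{code_point:02X} ({name}): {status} in XML 1.0"


print(describe(0x1F))  # the character from the error message
print(describe(0x09))  # tab, one of the three permitted control characters
```

Only tab (0x09), line feed (0x0A) and carriage return (0x0D) survive from the control range, which is why those three are missing from the list above.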

2. finding the offending character is not that easy.

I was able to find my offending character using 2 pieces of software that I had to download and install.

You can find a number of text editors here: Wikipedia - List of text editors.

I downloaded VIM here: Download Vim

This program is not easy to use, but all you need are the line and character number in the lower right hand corner and the "Find" command in the edit menu.

3. You cannot just throw the offending .sdlxliff file into the text editor and find old invalid 0x1F at Line 1, position 37736. Not only is it not at position 37736, it is not stored as a literal 0x1F. For some reason it has been transformed. In order to find it, you need a specialized program.

I converted the .sdlxliff file into a .tmx file and then used TMXValidator, a free program from the company MaxPrograms. Download here: TMX Validator Download

MaxPrograms has an .xliff checker program, but it does not seem to check the .xml characteristics, just the .xliff structure, so you do need the .sdlxliff to .tmx conversion step.

I opened the .tmx file with TMX Validator, and it identified that I was actually looking for

"Error on line 1922 of document file:/C:/Users/Tom/Documents/filename.tmx: Character reference "XXXXX" is an invalid XML character." (The actual character combination will not print here on ProZ, because it is invalid, which ProZ handles nicely.)

The "clean invalid characters" option in TMXValidator does not seem to work.

I then went back and found it in the VIM editor by searching for "XXXXX". Going to line 1922 and looking for XXXXX there also works. First record the words around the character so you can find it again in your main memory and in the .sdlxliff file. Then delete the horrid XXXXX. You may have several invalid characters, so record and clean them all.
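The search can also be scripted. The sketch below is an assumption-laden illustration rather than what TMXValidator does: it treats the file as UTF-8 text, looks for numeric character references (such as `&#x1F;` or `&#31;`) that resolve to a code point outside the XML 1.0 range, and also for such characters stored raw. The TMXValidator message above mentions a "character reference", which is presumably why a search for a literal 0x1F byte comes up empty. All names here are hypothetical.

```python
import re

# Numeric character references: decimal (&#31;) or hexadecimal (&#x1F;).
CHAR_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def is_xml10_char(cp: int) -> bool:
    """True if the code point is allowed by the XML 1.0 'Char' production."""
    return (cp in (0x09, 0x0A, 0x0D) or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD or 0x10000 <= cp <= 0x10FFFF)


def find_illegal(text: str, context: int = 30):
    """Yield (line, column, code_point, nearby_text) for each problem found."""
    hits = []
    # Case 1: character references that resolve to an illegal code point.
    for m in CHAR_REF.finditer(text):
        cp = int(m.group(1), 16) if m.group(1) else int(m.group(2))
        if not is_xml10_char(cp):
            hits.append((m.start(), cp))
    # Case 2: illegal characters stored raw in the file.
    for i, ch in enumerate(text):
        if not is_xml10_char(ord(ch)):
            hits.append((i, ord(ch)))
    for offset, cp in sorted(hits):
        line = text.count("\n", 0, offset) + 1
        column = offset - (text.rfind("\n", 0, offset) + 1) + 1
        nearby = text[max(0, offset - context):offset + context]
        yield line, column, cp, nearby


# Made-up sample standing in for the contents of a .tmx or .sdlxliff file.
sample = '<target>first line</target>\n<target>soft&#x1F;ware</target>\n'
for line, column, cp, nearby in find_illegal(sample, context=10):
    print(f"line {line}, column {column}: 0x{cp:02X} near {nearby!r}")
```

For a real file, read it with `open(path, encoding="utf-8", newline="").read()` and pass the result to `find_illegal`. The nearby text serves the same purpose as recording the surrounding words by hand.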

Next you must create a new project. (Trados won't open a project which contains an invalid character in an .sdlxliff file). Quickest is to create a new project, then (strangely) you can open the .sdlxliff file from the old project and delete the offending character. Here it is indeed an invisible character: still it could be "felt" as an extra necessary tap on the delete key at the place identified by TMXValidator and VIM.

You can also create a new project, then pre-translate the file using a new memory created from the cleaned .tmx file.

NB! You must also edit your main memory and other memories updated by the project and make sure you also clean the offending character/segment there.

----------------
The whole process may be slow the first time, but it really doesn't take very long once you've got the hang of it.

I suspect my Dragon NaturallySpeaking may have inserted the character, although it may also have been an unhappy keyboard combination.

It would be nice if Trados had a more robust and user-friendly check for such characters.

[Edited at 2011-03-07 18:31 GMT]

[Edited at 2011-03-07 18:54 GMT]


 
Strastran (X)
France
Local time: 03:05
French to English
+ ...
Simple solution found by chance May 30, 2011

Hi

First of all, thanks Tom for your fantastic explanation and help!

I had this error this morning, and after battling for a while I luckily stumbled upon a simple solution which I thought I'd share. As you say, when clicking on the error in Studio, nothing appears to be 'wrong' with the segment it takes you to. What I did (and don't ask me why I did it) was to delete the segment immediately above that. I then tried to Save As Target again, at which point I got the same error but that took me to a different segment. Again, I deleted the segment directly above it.

(When I say 'deleted the segment', I mean I removed the translated text from the target side).

Then, when I did Save As Target, it worked!

I then simply went back to the segments I'd deleted, found them in the TM and copied and pasted them manually into the target file.


 
Tom Fennell
United States
Local time: 20:05
Russian to English
+ ...
TOPIC STARTER
This sounds much easier....question May 30, 2011

Patrick Stenson wrote:

As you say, when clicking on the error in Studio, nothing appears to be 'wrong' with the segment it takes you to.



Happily it has been a while since I had to deal with this, but as I remember, one of the most difficult aspects of the problem was that Trados did not actually give any accurate diagnostics regarding where the error was in the sdlxliff file. It only gave a line number reference which had to do with the full xml text only visible in a line editor, not in Trados itself.

How did you find out which segments were the offending segments?

[Edited at 2011-05-30 18:51 GMT]

[Edited at 2011-05-30 18:52 GMT]


 
Lamontagne
France
Local time: 03:05
i) how to get to the segment ii) how to identify the offending character May 5, 2012

1) To get to the segment, just click on the error message. You'll be taken to the segment BELOW the one that causes problems.

2) Cut and paste that segment into a Word document and the offending character will appear (in my case as a square; I suppose this has to do with some sort of encoding). Delete it in the Word document, then copy and paste the text back into your target. It works!


 
Tom Fennell
United States
Local time: 20:05
Russian to English
+ ...
TOPIC STARTER
Simple! May 21, 2012

Thanks Lamontagne!

 
adrienneiii
United States
Local time: 18:05
Spanish to English
+ ...
I solved this using Tom's approach Aug 22, 2012

Hi, there, Patrick, I think you got very lucky. I struggled with this problem yesterday, and the culprit was not the segment immediately above the one that Trados took me to when I clicked on the error message. In fact, I tried deleting several segments on either side of that one, and still nothing.

So following Patrick's method, and that indicated by SDL (far too sparse suggestions/instructions at http://kb.sdl.com/kb/?ArticleId=3671&source=Article&c=12&cid=23#tab:homeTab:crumb:7:artId:3671), I downloaded this software, which was one of the first ones that came up after googling for "XML editor validation":

http://xml-copy-editor.sourceforge.net/

I know absolutely NOTHING about this whole thing, so was lucky to take the correct first step in clicking on XML>Validate>DTD/XML Schema (whatever that means).

That brought me to an enormously long line of text that encompassed 9 source and 9 target segments. Took me quite a while to scroll across and find the error in it. When I did (it was a # in the middle of a word when I pasted it into Word), I simply deleted that word in Trados, typed it out again and the world was a happy place again!

The problem segment in my case was actually six above the one that Trados indicated. An alternative, therefore, would be to paste a large number of surrounding segments into Word, as indicated by Patrick, and see if the offending character can be found that way, before going to the trouble of downloading the XML software.

So, thank you so much Tom! Hope my description also helps others with this problem.

Suggestion to SDL: couldn't you at least get the error message to indicate the relevant range of segments (i.e. those encompassed by the XML text line)?
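Until something like that exists, the lookup can be approximated with a script instead of scrolling across the long line. This is only a sketch under one big assumption: that segments in the .sdlxliff file are wrapped in `<mrk mtype="seg" mid="N">` markers and that `mid` roughly corresponds to the segment number shown in the editor. The function name and the sample line are invented.

```python
import re

# Assumption: each segment is wrapped as <mrk mtype="seg" mid="N"> ... </mrk>.
MRK_TAG = re.compile(r"<mrk\b[^>]*>")
MID_ATTR = re.compile(r'\bmid="([^"]*)"')


def segment_id_at(text: str, offset: int):
    """Return the mid of the last segment marker opened before offset, or None."""
    last = None
    # Only look at the part of the text that precedes the offending position.
    for tag in MRK_TAG.finditer(text, 0, offset):
        if 'mtype="seg"' in tag.group(0):
            mid = MID_ATTR.search(tag.group(0))
            if mid:
                last = mid.group(1)
    return last


# One long line holding two made-up segments, the second with a bad reference.
line = ('<target><mrk mtype="seg" mid="7">fine</mrk></target>'
        '<target><mrk mtype="seg" mid="8">bro&#x1F;ken</mrk></target>')
print(segment_id_at(line, line.index("&#x1F;")))
```

The offset can come from any tool that reports a position in the line, such as the column given by an XML validator (minus one, since columns usually start at 1).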

Thanks!


 
Remy Blaettler
Local time: 03:05
German to English
+ ...
You can fix it in Notepad++ too Oct 8, 2012

Or in any UTF-8 compatible text editor.

Just open the sdlxliff file and replace all occurrences of the invalid character reference (it does not display here) with an empty string.

Here are my instructions again:

http://tradoserrors.tumblr.com/post/32808305537/today-hexadecimal-value-0x1f-is-an-invalid
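For a batch of files, the same search-and-replace can be sketched in Python. This is an illustration, not Remy's exact procedure: it assumes the file is UTF-8, removes every numeric character reference that resolves to a code point outside the XML 1.0 range as well as any such character stored raw, and keeps a `.bak` copy first. The function names are hypothetical.

```python
import re
import shutil
from pathlib import Path

# Numeric character references: decimal (&#31;) or hexadecimal (&#x1F;).
CHAR_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def is_xml10_char(cp: int) -> bool:
    """True if the code point is allowed by the XML 1.0 'Char' production."""
    return (cp in (0x09, 0x0A, 0x0D) or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD or 0x10000 <= cp <= 0x10FFFF)


def strip_illegal(text: str):
    """Return (cleaned_text, number_of_removals)."""
    removed = 0

    def drop_bad_ref(m):
        nonlocal removed
        cp = int(m.group(1), 16) if m.group(1) else int(m.group(2))
        if is_xml10_char(cp):
            return m.group(0)  # legal reference such as &#9; stays untouched
        removed += 1
        return ""

    text = CHAR_REF.sub(drop_bad_ref, text)
    kept = [ch for ch in text if is_xml10_char(ord(ch))]
    removed += len(text) - len(kept)
    return "".join(kept), removed


def clean_file(path) -> int:
    """Back up the file next to itself, then rewrite it without illegal characters."""
    source = Path(path)
    shutil.copy2(source, source.with_name(source.name + ".bak"))
    # newline="" keeps the original line endings exactly as they were.
    with open(source, encoding="utf-8", newline="") as f:
        cleaned, removed = strip_illegal(f.read())
    with open(source, "w", encoding="utf-8", newline="") as f:
        f.write(cleaned)
    return removed
```

Calling `clean_file("translation.sdlxliff")` (a made-up file name) returns how many items were removed. As noted earlier in the thread, the same character may also sit in the translation memory, so cleaning the bilingual file alone may not be the end of it.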


 
Tom Fennell
United States
Local time: 20:05
Russian to English
+ ...
TOPIC STARTER
Yes Notepad++ is better Oct 8, 2012

Hi Remy,

Yes, I've now started using Notepad++ also, and I suspect it is much simpler to use it instead of VIM.

I'm hoping the newer versions of Trados deal automatically with these characters!

Best,

Tom


 
Svetlana Podkolzina
Portugal
English to Russian
+ ...
Thank you all Nov 20, 2012

In my case, though, it was caused by a faulty tag (I had inserted it using the Ctrl + comma shortcut, where various tags were combined in a group). After I copy-pasted the original tags from the source, I was able to save the target file.

[Edited at 2012-11-20 13:16 GMT]


 
Thao Phuong Thi Pham
Vietnam
Local time: 08:05
English to Vietnamese
hexadecimal-value-0x1f-is-an-invalid Feb 1, 2013

Yeah, I can deal with it. Thanks for your instruction.

[Edited at 2013-02-01 03:43 GMT]


 
Lorenzo Cordini
Local time: 03:05
English
Firefox to locate illegal characters Feb 1, 2013

Hi all,

Thank you for your suggestions.

Another way to spot the offending character would be to add the ".xml" extension to the sdlxliff file (e.g. "translation.sdlxliff" to "translation.sdlxliff.xml") and open the file in Mozilla Firefox.

The browser would clearly indicate where the illegal character is located.
You can leave the file open in Firefox while fixing the file in a text editor like Notepad++.
Save the file in Notepad++ and refresh the browser to check if the file is OK.

This also works with TMX or TTX files.
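The same well-formedness check can be run without a browser. Below is a small sketch using Python's built-in expat bindings; it reports only the first error, much as Firefox does. The function name is made up, and the column is assumed to be counted within the line as expat sees it, so it may differ slightly from what an editor shows for files with multi-byte characters.

```python
import xml.parsers.expat
from xml.parsers.expat import ExpatError


def first_xml_error(data: bytes):
    """Return None if data is well-formed XML, else (line, column, message)."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True: this is the final chunk of input
    except ExpatError as err:
        # err.offset counts from 0, so add 1 to get a 1-based column.
        message = xml.parsers.expat.ErrorString(err.code)
        return err.lineno, err.offset + 1, message
    return None


# Made-up two-line document with an invalid character reference on line 2.
broken = b"<file>\n<target>soft&#x1F;ware</target></file>"
print(first_xml_error(broken))
print(first_xml_error(b"<file><target>software</target></file>"))
```

For a real file, pass `open(path, "rb").read()`. After each fix in the text editor, run the check again until it returns `None`, which mirrors the save-and-refresh loop described above.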

Lorenzo

[Edited at 2013-02-01 12:01 GMT]


 
Lisa Ritchie
Germany
Local time: 03:05
Member (2010)
German to English
+ ...
Thank you!!! Apr 19, 2013

I used Patrick's solution and it was very fast and effective. Phew! Patrick, you just saved my life. Thank you!

 
ErikAnderson3
Local time: 18:05
Word files? Apr 28, 2013

We've run into this error most commonly with non-breaking hyphens, a character that Word inserts in numbered captions. These are represented in Studio using a placeable tag, and generally cause no problems, but if a segment containing this tag also generates some kind of QA message (due to tags missing, tags added, or tag order changed), then Studio notices that the tag represents an invalid character, and writes that to the internal QA message log. The problem is that Studio includes the invalid character itself in the log as an entity, thereby breaking its own XML.

Background aside, we've had pretty good luck just doing a global search-and-replace on the problem file to replace the problematic &#x... entity with a regular hyphen. This avoids any need to create new projects.

Your mileage may vary, but that works for us.
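A scripted version of that search-and-replace might look like the sketch below. The entity is elided in the post above, so the code point is left as a parameter; the value 0x1E in the usage line is purely an illustrative assumption, and the function name is invented.

```python
import re


def replace_char_ref(text: str, code_point: int, replacement: str = "-"):
    """Replace every numeric reference to code_point; return (new_text, count)."""
    # Matches the hex form (&#x1E;, &#x001e;) and the decimal form (&#30;).
    pattern = re.compile(
        r"&#(?:[xX]0*%X|0*%d);" % (code_point, code_point),
        re.IGNORECASE,
    )
    return pattern.subn(replacement, text)


# Hypothetical caption text; 0x1E is an assumed value, not taken from the post.
print(replace_char_ref("Figure 1&#x1E;2 and Table 3&#30;4", 0x1E))
```

Unlike simply deleting the reference, this keeps a visible hyphen in the caption number, which is the point of the approach described above.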

-- Erik Anderson


 
Celeste Klein
United States
Local time: 20:05
English to Spanish
+ ...
Thanks once again for your awesome explanations :) Jun 10, 2013

I recently ran into this same issue (the invalid hexadecimal etc. etc. error message) when trying to export a file for review.

Tom's solution (also explained in less detail on the SDL website) worked perfectly.

FYI... Just to add some info to the discussion for other desperate Trados users researching this topic in the future: The "culprit" in my sdlxliff file was a paragraph symbol commonly found in legal documents (esp. in arbitration cases) instead of the word "paragraph" for reference or citation purposes. It literally just looks like the paragraph mark you see in Word when you want to view formatting symbols.

I just replaced each instance of the symbol by the abbreviation "para." (also common in this type of document) in the source file and created a new project.

Best,
Celeste.


 
Ehab Tantawy
Local time: 03:05
Member (2006)
English to Arabic
+ ...
Perfect! Nov 18, 2013

Remy Blaettler wrote:

Or in any UTF-8 compatible text editor.

Just open the sdlxliff file and replace all occurrences of the invalid character reference (it does not display here) with an empty string.

Here are my instructions again:

http://tradoserrors.tumblr.com/post/32808305537/today-hexadecimal-value-0x1f-is-an-invalid


Many thanks to all the Prozians who discussed this issue so thoroughly here, and special thanks to Remy.

Actually, I followed your instructions on the SDLXLIFF file, and then a strange problem appeared: Trados Studio (2014) itself showed the same error message for the project and then closed immediately.

So I was no longer able to open the application.

However, I opened the [.sdlproj] file in Notepad++, caught the erroneous characters (they do not display here) and replaced them with nothing, as instructed in the link sent by Remy.

I exported the file for review and started working safely!


Again, thanks all for this great info.

Wish you a nice time.

Best,
Ehab


 