Trados Studio - invalid character - generate target file
Thread poster: Tom Fennell

Tom Fennell United States Local time: 20:05 Russian to English + ...
Just thought I'd share how I got out of a pickle after encountering the following error when trying to generate a target translation in Trados Studio 2009:

"hexadecimal value 0x1F, is an invalid character. Line 1, position 37736"

After the usual hours of research and trying, I found out the following.

1. The hexadecimal values reported in this error (0x is just the hexadecimal prefix) are non-printing control characters (beep, backspace, etc.), which are illegal in XML. From Seattle Software we find the following:

Ever get a System.Xml.XmlException that says: “Hexadecimal value 0x[whatever] is an invalid character” …when trying to load a XML document using one of the .NET XML API objects like XmlReader, XmlDocument, or XDocument? Was “0x[whatever]” by chance one of these characters?

0x00 0x01 0x02 0x03 0x04 0x05 0x06 0x07 0x08 0x0B 0x0C 0x0E 0x0F 0x10 0x11 0x12 0x13 0x14 0x15 0x16 0x17 0x18 0x19 0x1A 0x1B 0x1C 0x1D 0x1E 0x1F 0x7F

The problem that causes these XmlExceptions is that the data being read or loaded contains characters that are illegal according to the XML specifications. Almost always, these characters are in the ASCII control character range (think whacky characters like null, bell, backspace, etc). These aren’t characters that have any business being in XML data; they’re illegal characters that should be removed, usually having found their way into the data from file format conversions, like when someone tries to create an XML file from Excel data, or export their data to XML from a format that may be stored as binary.

If you are really interested, you can look at the Wikipedia article on ASCII control characters. In my case the offending character seems to have been some sort of Unit Separator (code US).

2. Finding the offending character is not that easy. I was able to find mine using two pieces of software that I had to download and install. You can find a number of text editors here: Wikipedia - List of text editors. I downloaded VIM here: Download Vim. This program is not easy to use, but all you need are the line and character number in the lower right-hand corner and the "Find" command in the Edit menu.

3. You cannot just throw the offending .sdlxliff file into the text editor and find old invalid 0x1F at Line 1, position 37736. Not only is it not at position 37736, it is not 0x1F. For some reason it has been transformed. In order to find it, you need a specialized program. I converted the .sdlxliff file into a .tmx file and then used TMXValidator, a free program from the company MaxPrograms. Download here: TMX Validator Download. MaxPrograms has an .xliff checker program, but it does not seem to check the XML characters, just the .xliff structure, so you do need the .sdlxliff to .tmx conversion step.

I opened the .tmx file with TMX Validator, and it identified what I was actually looking for:

"Error on line 1922 of document file:/C:/Users/Tom/Documents/filename.tmx: Character reference "XXXXX" is an invalid XML character."

(The actual character combination will not print here on ProZ, because it is invalid, which ProZ handles nicely.) The "clean invalid characters" option in TMXValidator does not seem to work.

I then went back and found it in the VIM editor by using FIND "XXXXX". Going to line 1922 and looking for XXXXX there also works. First record the words around the character so you can find it again in your main memory and the .sdlxliff file. Then delete the horrid XXXXX.

It is possible you may have multiple invalid characters: record and clean them all.

Next you must create a new project. (Trados won't open a project which contains an invalid character in an .sdlxliff file.) Quickest is to create a new project; then (strangely) you can open the .sdlxliff file from the old project and delete the offending character. Here it is indeed an invisible character; still, it could be "felt" as an extra tap needed on the Delete key at the place identified by TMXValidator and VIM. You can also create a new project, then pre-translate the file using a new memory created from the cleaned .tmx file.

NB! You must also edit your main memory and any other memories updated by the project and make sure you clean the offending character/segment there too.

----------------

The whole process may be slow the first time, but it really doesn't take very long once you've got the hang of it. I suspect my Dragon NaturallySpeaking may have inserted the character, although it may also have been an unhappy keyboard combination. It would be nice if Trados had a more robust and user-friendly check for such characters.
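For anyone comfortable running a script, the "find it, record the words around it, then clean it" steps can be roughed out in Python. This is only a sketch under assumptions: the file is assumed to be UTF-8, the function names (`find_invalid`, `strip_invalid`, `scan_file`) are made up for illustration, and it looks for both raw control characters and numeric character references such as `&#x1F;`, since the character may have been transformed into a reference. It follows the XML 1.0 rule that only tab, line feed and carriage return are allowed below 0x20, so 0x7F (discouraged but not forbidden by that spec) is not flagged. Work on a copy of the file, not the original.

```python
import re

# Raw C0 control characters that XML 1.0 forbids (tab, LF and CR are allowed).
RAW_CONTROL = re.compile(r"[\x00-\x08\x0B\x0C\x0E-\x1F]")
# Numeric character references such as &#x1F; (hex) or &#31; (decimal).
CHAR_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def _ref_code_point(match):
    """Code point that a numeric character reference stands for."""
    if match.group(1) is not None:
        return int(match.group(1), 16)
    return int(match.group(2))


def _is_forbidden(code_point):
    """True for control code points below 0x20 that XML 1.0 does not allow."""
    return code_point < 0x20 and code_point not in (0x09, 0x0A, 0x0D)


def find_invalid(text, context=30):
    """Return (line, column, code_point, surrounding_text) for every hit.

    Line and column are 1-based; surrounding_text marks the spot with [HERE]
    so the neighbouring words can be noted down and searched for later.
    """
    hits = [(m.start(), m.end(), ord(m.group())) for m in RAW_CONTROL.finditer(text)]
    for m in CHAR_REF.finditer(text):
        code_point = _ref_code_point(m)
        if _is_forbidden(code_point):
            hits.append((m.start(), m.end(), code_point))

    report = []
    for start, end, code_point in sorted(hits):
        line = text.count("\n", 0, start) + 1
        column = start - (text.rfind("\n", 0, start) + 1) + 1
        around = text[max(0, start - context):start] + "[HERE]" + text[end:end + context]
        report.append((line, column, code_point, around))
    return report


def strip_invalid(text, replacement=""):
    """Return text with forbidden characters and references replaced."""
    text = RAW_CONTROL.sub(replacement, text)
    return CHAR_REF.sub(
        lambda m: replacement if _is_forbidden(_ref_code_point(m)) else m.group(0),
        text,
    )


def scan_file(path):
    """Read a file (assumed UTF-8, with or without BOM) and report the hits."""
    with open(path, encoding="utf-8-sig", newline="") as handle:
        return find_invalid(handle.read())
```

Calling `scan_file("copy_of_my_file.sdlxliff")` (a hypothetical file name) returns one tuple per hit; the surrounding text is usually more useful than the line number, because a whole .sdlxliff can sit on a handful of very long lines.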
[Edited at 2011-03-07 18:31 GMT]
[Edited at 2011-03-07 18:54 GMT]

Strastran (X) France Local time: 03:05 French to English + ... | Simple solution found by chance | May 30, 2011
Hi,

First of all, thanks Tom for your fantastic explanation and help! I had this error this morning, and after battling for a while I luckily stumbled upon a simple solution which I thought I'd share.

As you say, when clicking on the error in Studio, nothing appears to be 'wrong' with the segment it takes you to. What I did (and don't ask me why I did it) was to delete the segment immediately above that. I then tried to Save As Target again, at which point I got the same error, but it took me to a different segment. Again, I deleted the segment directly above it. (When I say 'deleted the segment', I mean I removed the translated text from the target side.) Then, when I did Save As Target, it worked! I then simply went back to the segments I'd deleted, found them in the TM and copied and pasted them manually into the target file.

Tom Fennell United States Local time: 20:05 Russian to English + ... TOPIC STARTER | This sounds much easier....question | May 30, 2011
Patrick Stenson wrote:
As you say, when clicking on the error in Studio, nothing appears to be 'wrong' with the segment it takes you to.

Happily it has been a while since I had to deal with this, but as I remember, one of the most difficult aspects of the problem was that Trados did not actually give any accurate diagnostics regarding where the error was in the sdlxliff file. It only gave a line number reference, which had to do with the full XML text that is only visible in a line editor, not in Trados itself. How did you find out which segments were the offending segments?
[Edited at 2011-05-30 18:51 GMT]
[Edited at 2011-05-30 18:52 GMT]

i) how to get to the segment ii) how to identify the offending character | May 5, 2012
1) To get to the segment, just click on the problem; you'll be taken to the segment BELOW the one that causes problems.
2) Cut and paste it into a Word document, and the offending character will appear (in my case as a square; I suppose this has to do with some sort of encoding). Delete it in the Word document, then copy and paste the text back to your target. It works!
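If Word is not at hand, the same "make the invisible character show up" trick can be approximated with a few lines of Python. A minimal sketch, assuming you have copied the segment text into a string; the function name `reveal` is invented for this example, and it simply swaps every control or format character (other than tab and line breaks) for a visible marker.

```python
import unicodedata


def reveal(text):
    """Replace non-printing characters with a visible <U+XXXX> marker."""
    out = []
    for ch in text:
        # "Cc" = control characters, "Cf" = invisible format characters.
        if unicodedata.category(ch) in ("Cc", "Cf") and ch not in "\t\n\r":
            out.append(f"<U+{ord(ch):04X}>")
        else:
            out.append(ch)
    return "".join(out)
```

For example, `print(reveal(segment_text))` on a pasted segment would show something like `<U+001F>` in the middle of a word at the spot where Word displays a square.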
adrienneiii United States Local time: 18:05 Spanish to English + ... | I solved this using Tom's approach | Aug 22, 2012
Hi there, Patrick,

I think you got very lucky. I struggled with this problem yesterday, and the culprit was not the segment immediately above the one that Trados took me to when I clicked on the error message. In fact, I tried deleting several segments on either side of that one, and still nothing.

So following Patrick's method, and that indicated by SDL (far too sparse suggestions/instructions at http://kb.sdl.com/kb/?ArticleId=3671&source=Article&c=12&cid=23#tab:homeTab:crumb:7:artId:3671), I downloaded this software, which was one of the first ones that came up after googling for "XML editor validation": http://xml-copy-editor.sourceforge.net/

I know absolutely NOTHING about this whole thing, so I was lucky to take the correct first step in clicking on XML > Validate > DTD/XML Schema (whatever that means). That brought me to an enormously long line of text that encompassed 9 source and 9 target segments. It took me quite a while to scroll across and find the error in it. When I did (it was a # in the middle of a word when I pasted it into Word), I simply deleted that word in Trados, typed it out again and the world was a happy place again! The problem segment in my case was actually six above the one that Trados indicated.

An alternative, therefore, would be to paste a large number of surrounding segments into Word, as indicated by Patrick, and see if the offending character can be found that way, before going to the trouble of downloading the XML software.

So, thank you so much Tom! Hope my description also helps others with this problem.

Suggestion to SDL: couldn't you at least get the error message to indicate the relevant range of segments (i.e. those encompassed by the XML text line)? Thanks!

Tom Fennell United States Local time: 20:05 Russian to English + ... TOPIC STARTER | Yes Notepad++ is better | Oct 8, 2012
Hi Remy,

Yes, I've now started using Notepad++ also, and I suspect it is much simpler to use it instead of VIM. I'm hoping the newer versions of Trados deal automatically with these characters!

Best, Tom
Thank you all | Nov 20, 2012

In my case it was caused by a faulty tag (I had inserted it using the Ctrl + comma shortcut, where various tags were combined in a group). After I copied and pasted the original tags from the source, I was able to save the target file.
[Edited at 2012-11-20 13:16 GMT]

hexadecimal-value-0x1f-is-an-invalid | Feb 1, 2013
Yeah, I can deal with it. Thanks for your instruction.
[Edited at 2013-02-01 03:43 GMT]

Firefox to locate illegal characters | Feb 1, 2013
Hi all,

Thank you for your suggestions. Another way to spot the offending character would be to add the ".xml" extension to the sdlxliff file (e.g. "translation.sdlxliff" to "translation.sdlxliff.xml") and open the file in Mozilla Firefox. The browser would clearly indicate where the illegal character is located. You can leave the file open in Firefox while fixing the file in a text editor like Notepad++. Save the file in Notepad++ and refresh the browser to check if the file is OK. This also works with TMX or TTX files.

Lorenzo
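The same kind of check can also be scripted. Below is a small Python sketch that asks the standard library's expat parser whether the file is well-formed and, if not, where the first problem is, roughly the information the browser shows. The function name `first_xml_error` is made up for this example, and like the browser it only reports the first error, so it needs to be re-run after each fix.

```python
import xml.parsers.expat


def first_xml_error(data):
    """Return None if data (bytes) is well-formed XML.

    Otherwise return (line, column, message) for the first error found.
    The line is 1-based; the column is 0-based, as reported by expat.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True: this is the final chunk of input
    except xml.parsers.expat.ExpatError as err:
        return err.lineno, err.offset, xml.parsers.expat.ErrorString(err.code)
    return None
```

To use it, read the file in binary mode, e.g. `first_xml_error(open("translation.sdlxliff", "rb").read())` with whatever file name applies, and look at the reported line and column in a text editor.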
[Edited at 2013-02-01 12:01 GMT]

Lisa Ritchie Germany Local time: 03:05 Member (2010) German to English + ... | Thank you!!! | Apr 19, 2013
I used Patrick's solution and it was very fast and effective. Phew! Patrick, you just saved my life. Thank you!
We've run into this error most commonly with non-breaking hyphens, a character that Word inserts in numbered captions. These are represented in Studio using a placeable tag, and generally cause no problems, but if a segment containing this tag also generates some kind of QA message (due to tags missing, tags added, or tag order changed), then Studio notices that the tag represents an invalid character, and writes that to the internal QA message log. The problem is that Studio includes the invalid character itself in the log as an entity, thereby breaking its own XML.

Background aside, we've had pretty good luck just doing a global search-and-replace on the problem file to replace the problematic &#x... entity with a regular hyphen. This avoids any need to create new projects. Your mileage may vary, but that works for us.

-- Erik Anderson

Celeste Klein United States Local time: 20:05 English to Spanish + ... | Thanks once again for your awesome explanations :) | Jun 10, 2013
I recently ran into this same issue (the invalid hexadecimal etc. etc. error message) when trying to export a file for review. Tom's solution (also explained in less detail on the SDL website) worked just perfectly, FYI...

Just to add some info to the discussion for other desperate Trados users researching this topic in the future: the "culprit" in my sdlxliff file was a paragraph symbol commonly found in legal documents (esp. in arbitration cases) instead of the word "paragraph" for reference or citation purposes. It literally just looks like the paragraph mark you see in Word when you want to view formatting symbols. I just replaced each instance of the symbol with the abbreviation "para." (also common in this type of document) in the source file and created a new project.

Best, Celeste.

Ehab Tantawy Local time: 03:05 Member (2006) English to Arabic + ...
Many thanks to all the Prozians who discussed this issue thoroughly here, and special thanks to Remy.

Actually, I followed your instructions on the SDL XLIFF file and a strange problem appeared: Trados Studio (2014) itself showed the same error message for the project and then closed immediately, so I was no longer able to open the application. However, I opened the [.sdlproj] file using Notepad++, caught the erroneous characters ( ) and replaced them with nothing, as instructed in the link sent by Remy. I then exported the file for review and started working safely!

Again, thanks all for this great info. Wish you a nice time.

Best, Ehab