Entities do not convert after using the Snippet plugin (SDL 2007)
Thread poster: motuzik
motuzik
Slovakia
Local time: 02:12
Aug 28, 2009

Hi all,

I have an .xml document in which all of the text to be translated sits inside a CDATA section (used as a container for the HTML-formatted text included in the .xml file).

Now,

1. I was successful at using the Snippet 1.0 plugin to parse the data contained within the CDATA tag - i.e. mark all the html formatting tags as untranslatable, to separate them from the actual text.

2. I still had to activate entity conversion so that the diacritic characters would be declared as entities,
e.g. č = &#x010D; (HTML), 010D (Unicode),
etc.

3. But when I defined all the HTML entities to be converted as required, they did not convert. As a result, the web environment I am importing the files into does not display these characters and replaces them with question marks.
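
To make clear what the conversion in step 2 is expected to produce, here is a minimal Python sketch. It is not part of Trados or the Snippet plugin, and the function names are made up for illustration. It turns every non-ASCII character into a numeric character reference, either decimal or hexadecimal:

```python
def to_decimal_entities(text: str) -> str:
    """Replace every non-ASCII character with a decimal reference such as &#269;."""
    # "xmlcharrefreplace" is a standard codec error handler in Python.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def to_hex_entities(text: str) -> str:
    """Replace every non-ASCII character with a hexadecimal reference such as &#x010D;."""
    return "".join(
        ch if ord(ch) < 128 else f"&#x{ord(ch):04X};"
        for ch in text
    )


if __name__ == "__main__":
    sample = "<p>čaj</p>"
    print(to_decimal_entities(sample))  # <p>&#269;aj</p>
    print(to_hex_entities(sample))      # <p>&#x010D;aj</p>
```

Both forms refer to the same code point (U+010D), so a browser should render either one as č.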


Does anybody know how to solve what I assume to be a Snippet plugin vs. entity conversion conflict?
I assume this because the entities do convert when I edit the .xml and remove the CDATA strings from the file. However, I cannot use this workaround: with 500 files and the database still growing, my solution has to be somewhat more reliable, and I am tired of workarounds in Trados anyway, aren't you?


Thanks all for your kind words of advice.

Matus

[Edited at 2009-08-28 11:43 GMT]

[Edited at 2009-08-28 11:45 GMT]


 
dwalsh
Local time: 19:12
Does anyone have a solution? Oct 14, 2009

I am also having this same issue. Does anyone know of a solution?

 

Stanislav Pokorny
Czech Republic
Local time: 02:12
English to Czech
Little experience Oct 14, 2009

Hi all,
my experience is that converting CE characters to entities rarely works in XML files. Therefore I always deselect the option when defining the INI file.
If you let Trados convert CE characters into entities and then try to open the XML in Internet Explorer, you will receive an error. If I then use the character directly, e.g. "š" instead of &scaron;, everything works nicely.
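
A likely reason for the Internet Explorer error is that XML itself predefines only five named entities (&lt;, &gt;, &amp;, &apos; and &quot;), so a named HTML entity such as &scaron; is undefined unless a DTD declares it. Numeric references and literal characters do not have this problem. A small Python check illustrates the difference; the sample strings and the helper name `parses` are made up for illustration:

```python
import xml.etree.ElementTree as ET


def parses(xml_text: str) -> bool:
    """Return True if xml_text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(parses("<p>&scaron;</p>"))  # False: named HTML entity, not declared in XML
    print(parses("<p>&#353;</p>"))    # True: numeric character reference
    print(parses("<p>š</p>"))         # True: literal character
```

Note that this only concerns text outside CDATA. Inside a CDATA section the parser does not interpret entities at all and passes them through as plain text.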

[Edited at 2009-10-14 17:30 GMT]


 
motuzik
Slovakia
Local time: 02:12
TOPIC STARTER
Different problem focus Oct 15, 2009

Hi,

I fear the problem here is of a somewhat different nature. Converting the added Latin 1 and Latin 2 characters, as well as custom ones you can define under "User defined" values, works just fine within .xml. Unfortunately, it only works until you switch on the Snippet plugin. I think it may have something to do with how Snippet parses the CDATA HTML sections embedded in the .xml. Honestly, though, I cannot understand why: all Snippet does is parse the CDATA sections and create a new .ttx with the formatting tags marked. It should then be easy to open this new .ttx, translate, and use Save Target As, and the program should simply replace the characters with entities as specified in the entity settings. Unfortunately, Save Target As does not produce the hoped-for result: it inserts all the diacritic characters in their WYSIWYG form, and that is the point where you really get into trouble with the .xml...

My impression from this incorrect processing in TagEditor is that, without the Snippet plugin, TagEditor can save in the hexadecimal HTML encoding (entities), but as soon as you parse an .xml using Snippet, it ignores the encoding, or maybe it just changes the output file format and ignores the encoding as a consequence?
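
One way to get the hexadecimal entities back would be to post-process the saved target file outside TagEditor. The following Python sketch is only an assumption about what such a step could look like; it is not a Trados feature and the names `CDATA_RE` and `convert_cdata_entities` are hypothetical. It rewrites non-ASCII characters as hexadecimal references, but only inside CDATA sections, and leaves the rest of the .xml untouched:

```python
import re

# Matches one CDATA section and captures its content (non-greedy, across lines).
CDATA_RE = re.compile(r"<!\[CDATA\[(.*?)\]\]>", re.DOTALL)


def _hex_refs(text: str) -> str:
    """Replace non-ASCII characters with references such as &#x010D;."""
    return "".join(
        ch if ord(ch) < 128 else f"&#x{ord(ch):04X};"
        for ch in text
    )


def convert_cdata_entities(xml_text: str) -> str:
    """Convert non-ASCII characters to hex references inside CDATA sections only."""
    return CDATA_RE.sub(
        lambda m: "<![CDATA[" + _hex_refs(m.group(1)) + "]]>",
        xml_text,
    )


if __name__ == "__main__":
    saved = "<item>š<![CDATA[<b>čaj</b>]]></item>"
    print(convert_cdata_entities(saved))
    # <item>š<![CDATA[<b>&#x010D;aj</b>]]></item>
```

Because an XML parser does not expand entities inside CDATA, the references reach the web environment as literal HTML text, which is presumably what the import expects.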

Either way, the entity settings are completely ignored when the Snippet plugin is used. The following does the job: edit the original .xml and replace all CDATA tags with any working string; open the .xml without using Snippet (now that you have removed CDATA it would be a waste of time anyway); translate (with the correct entity values defined for conversion); save the target file; and finally open the saved .xml and replace the working string with CDATA again. However, you have to admit it cannot be used in any workflow where hundreds of .xml files need to be processed regularly, unless you intend to employ 4 additional highly trained chimpanzees to help you find/replace CDATA...
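
The find/replace part of this workaround could at least be scripted. Below is a minimal Python sketch under stated assumptions: the placeholder strings, the function names and the UTF-8 default are all made up for illustration, and the files are overwritten in place, so it should only be run on copies:

```python
from pathlib import Path

CDATA_OPEN, CDATA_CLOSE = "<![CDATA[", "]]>"
# Hypothetical "working strings"; any text that never occurs in the files will do.
MASK_OPEN, MASK_CLOSE = "@@CDATA_OPEN@@", "@@CDATA_CLOSE@@"


def mask_cdata(text: str) -> str:
    """Replace the CDATA markers with placeholder strings (before translation)."""
    return text.replace(CDATA_OPEN, MASK_OPEN).replace(CDATA_CLOSE, MASK_CLOSE)


def unmask_cdata(text: str) -> str:
    """Restore the CDATA markers from the placeholder strings (after translation)."""
    return text.replace(MASK_OPEN, CDATA_OPEN).replace(MASK_CLOSE, CDATA_CLOSE)


def process_folder(folder: str, transform, encoding: str = "utf-8") -> int:
    """Apply transform to every .xml file in folder, in place. Returns the file count."""
    count = 0
    for path in Path(folder).glob("*.xml"):
        path.write_text(transform(path.read_text(encoding=encoding)), encoding=encoding)
        count += 1
    return count
```

Usage would be `process_folder("source", mask_cdata)` before translating and `process_folder("target", unmask_cdata)` on the saved target files, where `source` and `target` are example folder names.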

Why this does not work as it should, I do not know. It is as with most features in Trados: you always look forward to how they will help you save time, but once you try them out, it is quite the contrary...

Matus


 

Vitor Hugo Alves
Ireland
Local time: 01:12
English to Portuguese
RegEx Oct 15, 2009

I am wondering whether writing a different regular expression would help with that conflict, so that you would not have to use the entity conversion option at all?
Which regular expression are you using?

