Tagging/Removing Embedded HTML code in XLIFF
Thread poster: slickrock22

slickrock22
Local time: 21:59
Aug 26, 2011

Can someone help me understand the steps needed to get SDL Trados 2009 SP3 to import an XLIFF file, interpret the HTML embedded in the translation source, and convert it to tags? The goal is to get a better match against a TM that does not contain the same embedded HTML. I can currently import the XLIFF with no problem and it works great. It is the additional embedded HTML content that is giving me fits.

Here is the example.

I am importing using the standard XLIFF file type provided by SDL.

The text looks like the following in the XLIFF file and shows up as HTML in SDL. It does not show up as tags in SDL.

(FOR THIS POST ONLY, I have replaced all '<' and '>' characters with '-' so that the forum does not render my data as HTML. To recreate my issue, you will need to replace each '-' with the corresponding '<' or '>'.)

-source--div--strong-Was situation resolved?-/strong--/div--/source-

It looks like the following in SDL

-div--strong-Was situation resolved?-/strong--/div-

If I extract the text '-div--strong-Was situation resolved?-/strong--/div-' directly from the HTML source and open it in SDL from a file with a .html extension, then SDL immediately recognizes the HTML and does not display it in the Editor window. It only displays 'Was situation resolved?' and the TM match is 100%.

It is almost like I need to first import the XLIFF file into SDL and then run another pass using the SDL HTML file type. My research suggests that I need to create a new file type based on XML Generic and then combine the XLIFF and HTML logic in it. Does this sound right? Could anyone provide an example? I would be ever grateful if someone could even send me a private message with an export of that config.

Thanks for any suggestions!!



[Edited at 2011-08-26 15:35 GMT]



 

Epameinondas Soufleros  Identity Verified
Greece
Local time: 05:59
Member (2008)
English to Greek
memoQ 5.0 Aug 26, 2011

Download memoQ 5.0 Release Candidate, install it, and have all the Regex Tagging fun a translator can wish for!

[Edited at 2011-08-26 16:08 GMT]


 

slickrock22
Local time: 21:59
TOPIC STARTER
Direction Aug 27, 2011

Even pointing me in the right direction would be great. For example, does this need to be done at file import only, or is it possible to do it after the file has been imported into SDL? Thanks!
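
For readers wondering what a separate "second pass" over the XLIFF might look like outside the CAT tool, here is a minimal Python sketch. It is not an SDL feature or file type. It assumes the embedded HTML is stored entity-escaped inside each source element (which would explain why it shows up as text rather than tags) and that the file uses the XLIFF 1.2 namespace. The sample document and the names `TextOnly` and `plain_sources` are made up for illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Assumption: the file is XLIFF 1.2 and uses the standard namespace.
XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"


class TextOnly(HTMLParser):
    """Collects only the text between HTML tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def plain_sources(xliff_text):
    """Return the text of every <source> element with embedded HTML removed."""
    root = ET.fromstring(xliff_text)
    result = []
    for source in root.iter(f"{{{XLIFF_NS}}}source"):
        parser = TextOnly()
        # The XML parser has already turned &lt; and &gt; back into < and >.
        parser.feed(source.text or "")
        parser.close()
        result.append("".join(parser.parts))
    return result


# Hypothetical sample, modelled on the segment quoted in the first post.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="sample" source-language="en" datatype="plaintext">
    <body>
      <trans-unit id="1">
        <source>&lt;div&gt;&lt;strong&gt;Was situation resolved?&lt;/strong&gt;&lt;/div&gt;</source>
      </trans-unit>
    </body>
  </file>
</xliff>"""

print(plain_sources(SAMPLE))
```

This only shows which text is left to match against the TM once the markup is set aside. It discards the formatting, so a real workflow would need to turn the HTML into tags instead of deleting it.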

 

slickrock22
Local time: 21:59
TOPIC STARTER
MemoQ Aug 29, 2011

Thanks for pointing me to the memoQ regex tool. I just tested it and it worked 100%. Great feature. I'm pretty astonished that SDL doesn't have something like this. Good work.
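
As a rough illustration of the idea behind regex tagging (not memoQ's actual implementation), the Python sketch below matches HTML-like markup in a segment with a regular expression and swaps each match for a numbered placeholder, so that only the translatable text is compared with the TM. The pattern, the `{n}` placeholder style, and the function name `tag_segment` are assumptions chosen for this example.

```python
import re

# Assumption: anything that looks like <name ...> or </name> counts as a tag.
TAG_PATTERN = re.compile(r"</?[A-Za-z][^<>]*>")


def tag_segment(segment):
    """Split a segment into placeholder-tagged text, the tags, and plain text."""
    tags = []

    def to_placeholder(match):
        tags.append(match.group(0))
        return f"{{{len(tags)}}}"  # {1}, {2}, ... in order of appearance

    tagged = TAG_PATTERN.sub(to_placeholder, segment)
    text_only = TAG_PATTERN.sub("", segment)
    return tagged, tags, text_only


tagged, tags, text_only = tag_segment(
    "<div><strong>Was situation resolved?</strong></div>"
)
print(tagged)     # {1}{2}Was situation resolved?{3}{4}
print(tags)       # ['<div>', '<strong>', '</strong>', '</div>']
print(text_only)  # Was situation resolved?
```

With the markup reduced to placeholders, the remaining text is the same 'Was situation resolved?' string that gave the 100% match when the snippet was opened as a .html file.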

 

