XML files - embedded HTML code - SDL Trados 2011
Thread poster: Anna Sarah Krämer Fazendeiro

Anna Sarah Krämer Fazendeiro
Germany
Local time: 23:42
Member (2011)
English to German
+ ...
Dec 28, 2012

I am trying to analyze some XML files, and a lot of HTML code appears in the translatable text (and is included in the word count).

How do I filter the HTML out so that it doesn't appear in the translatable text?

Kind regards and thank you for your time,
Anna


 

Stanislav Pokorny  Identity Verified
Czech Republic
Local time: 23:42
English to Czech
+ ...
RegEx Dec 29, 2012

Hi Anna,
you need to set up a suitable filetype for your XML in Studio. Once you have set up the XPath parser rules, go to the Embedded Content settings and add the following regular expression:

</?[a-z][a-z0-9]*[^<>]*>

This should catch most of the embedded HTML tags and exclude them from the analysis.
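To see what this expression does and does not match, here is a minimal sketch in Python. It only checks the pattern itself outside Studio; the sample string and the helper name `split_tags` are made up for illustration, and Python's `re` module may not behave identically to the regex engine Studio uses.

```python
import re

# The expression from the post above, unchanged.
TAG_PATTERN = re.compile(r"</?[a-z][a-z0-9]*[^<>]*>")

def split_tags(text):
    """Return (tags, words): the matched tags and the words left over."""
    tags = TAG_PATTERN.findall(text)
    # Replace each tag with a space so neighbouring words do not merge.
    remaining = TAG_PATTERN.sub(" ", text)
    return tags, remaining.split()

# Hypothetical segment text with embedded HTML.
sample = 'Click <a href="help.html">here</a> for <b>more</b> info.<br/>'
tags, words = split_tags(sample)
print(tags)   # the five tags: <a ...>, </a>, <b>, </b>, <br/>
print(words)  # the five words that would remain translatable
```

Note that the pattern only matches lowercase tag names here: in Python, `<BR>` would need the `re.IGNORECASE` flag or an `[A-Za-z]` character class. Whether the same adjustment is needed in Studio is an assumption to verify against your own files.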

(I apologize for making it this short, but it's already 2.40 a.m. here. :))

[Edited at 2012-12-29 01:41 GMT]


 

Anna Sarah Krämer Fazendeiro
Germany
Local time: 23:42
Member (2011)
English to German
+ ...
TOPIC STARTER
For dummies... Dec 29, 2012

I'm afraid I'll need the explanation "for dummies":

What document structure information do I have to put in there?

Do I use the regex in Placeholder tag or Tag pairs?

I tried different regular expressions yesterday, as shown in the SDL online help. The process seems to be explained well enough and I followed the steps given there, yet none of it seemed to work. What else might be going wrong here?


 

Stanislav Pokorny  Identity Verified
Czech Republic
Local time: 23:42
English to Czech
+ ...
Info Dec 29, 2012

Hi Anna,
I'm sending you a PM. Please reply to my email and I will forward step-by-step instructions along with some screenshots I got from Paul Filkin from SDL. He pulled my buttocks out of trouble with his detailed instructions on how to set up a filetype.


 

AndreasDunker
United States
Local time: 17:42
English to German
Hi Stanislav May 23, 2013

Could you pleeeease send me those instructions too? I am pulling out hair I don't have any more trying to figure this out. I'm no idiot and I'm an experienced translator, but I'm new to Trados and these XML settings are just too much for me.

 
Please also send me these instructions Aug 6, 2013

I have an XML file with HTML codes in it, and I don't want all of this junk to become part of my TM.

 

ameisenmann
Germany
structure information Feb 2, 2014

I also struggled for some time with this issue until I found out that the "Document Structure Information" has to match the one that is defined for the element in the parser rules. If you do not set any structure info for the parser rule, it won't work. What exactly you choose as structure info is arbitrary. It only has to match.
Also described here:
http://producthelp.sdl.com/sdl%20trados%20studio/client_en/File_Types/Configure_EmbedCont_in_XML_Files.htm
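
The matching requirement can be pictured with a small conceptual model in Python. This is not Studio's API or file format: every name here (`parser_rules`, `embedded_content_structure_info`, the "Paragraph" label, the element names) is hypothetical and only illustrates why a parser rule without structure info never gets its embedded tags processed.

```python
import re

# Tag expression quoted earlier in this thread.
TAG_PATTERN = re.compile(r"</?[a-z][a-z0-9]*[^<>]*>")

# Hypothetical parser rules: element name -> structure info (None = not set).
parser_rules = {
    "description": "Paragraph",
    "title": None,
}

# Hypothetical embedded content setting: the structure info labels
# that the regex rules apply to.
embedded_content_structure_info = {"Paragraph"}

def tags_are_processed(element_name):
    """Regex rules apply only when the rule's label matches the setting."""
    label = parser_rules.get(element_name)
    return label is not None and label in embedded_content_structure_info

def visible_text(element_name, text):
    """Strip embedded tags only for elements whose structure info matches."""
    if tags_are_processed(element_name):
        return TAG_PATTERN.sub("", text)
    return text

print(visible_text("description", "<b>Hello</b> world"))  # tags removed
print(visible_text("title", "<b>Hello</b> world"))        # tags left as text
```

The label itself carries no meaning in this model; the only thing that matters is that the same label appears on both sides.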









 

ulrika månsson
Sweden
English to Swedish
+ ...
thank you, ameisenmann! Mar 28, 2014

ameisenmann wrote:

I also struggled for some time with this issue until I found out that the "Document Structure Information" has to match the one that is defined for the element in the parser rules. If you do not set any structure info for the parser rule, it won't work. What exactly you choose as structure info is arbitrary. It only has to match.


This was the final info I needed, problem solved, thank you ever so much!
/Ulrika


 

rgokalp
Turkey
Local time: 00:42
English to Turkish
stellent xml translation Apr 1, 2014

Hi,

Does anyone have experience in translating "stellent xml" with Studio 2011? These files are normally detected as any-xml, and they contain too many tags. I tried both adding regex rules to any-xml and defining a new file type by adding regex rules and adjusting the detection settings. Neither worked. The tags remain as plain text, and Studio runs so slowly that even translating very small files is becoming a real pain.

Thank you very much in advance for any help :)


 

SDL Community  Identity Verified
United Kingdom
Local time: 23:42
English
Most likely... Apr 1, 2014

... you are not defining the filetype correctly. The sensible process (for me) is this:

1. Create a new XML filetype first
2. Import your stellent.xml file to get the possible parser rules for that file
3. Refine your parser rules so you get only what you need
4. Add structure to the rules that contain embedded content
5. Add the regex rules so they apply to the appropriate structure

You do this in Tools -> Options and not Project Settings (at least this is the way I would do it) as it’s faster to check that the filetype settings work by using the Open Document command. Now create your Project and check that your new filetype is picked up.

That’s it.

If you have done all of this and it still does not work I’d be happy to take a look at this for you. You can send me an appropriate stellent.xml sample file via pfilkin@sdl.com

Regards

Paul
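
Before refining parser rules and adding structure (steps 3 and 4 above), it can help to know which elements in a sample file actually carry embedded markup. The following Python sketch is one possible way to check that outside Studio. The sample XML and its element names are invented for illustration (a real stellent.xml will look different), and the function name is hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Tag expression quoted earlier in this thread.
TAG_PATTERN = re.compile(r"</?[a-z][a-z0-9]*[^<>]*>")

# Hypothetical sample; embedded HTML appears in CDATA and as escaped entities.
SAMPLE = """<doc>
  <title>Plain title</title>
  <body><![CDATA[<p>Some <b>bold</b> text</p>]]></body>
  <note>Escaped &lt;i&gt;markup&lt;/i&gt; here</note>
</doc>"""

def elements_with_embedded_tags(xml_text):
    """Return the sorted names of elements whose text contains HTML-like tags."""
    root = ET.fromstring(xml_text)
    found = set()
    for element in root.iter():
        # element.text is the parsed text, with CDATA and entities resolved.
        if element.text and TAG_PATTERN.search(element.text):
            found.add(element.tag)
    return sorted(found)

print(elements_with_embedded_tags(SAMPLE))
```

The elements it reports are the candidates for a parser rule with structure info and a matching embedded content setting; elements it does not report can usually stay as plain translatable rules. The sketch only inspects the text directly inside each element, not text that follows a child element.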


 

rgokalp
Turkey
Local time: 00:42
English to Turkish
It worked :) Apr 1, 2014

SDL Support wrote:

... you are not defining the filetype correctly. The sensible process (for me) is this:

1. Create a new XML filetype first
2. Import your stellent.xml file to get the possible parser rules for that file
3. Refine your parser rules so you get only what you need
4. Add structure to the rules that contain embedded content
5. Add the regex rules so they apply to the appropriate structure

You do this in Tools -> Options and not Project Settings (at least this is the way I would do it) as it’s faster to check that the filetype settings work by using the Open Document command. Now create your Project and check that your new filetype is picked up.

That’s it.

If you have done all of this and it still does not work I’d be happy to take a look at this for you. You can send me an appropriate stellent.xml sample file via pfilkin@sdl.com

Regards

Paul


Thank you very much Paul, it worked.
Best,


 

Henri-Axel Carlander  Identity Verified
United States
Local time: 17:42
Member (2010)
English to French
+ ...
New XML file type in Studio 2011 Mar 14, 2016

Hello there,

I am trying to create a new XML file type in Studio 2011, but the new XML type does not appear in the list of extensions when I try to add a file through the Add Files option of Studio 2011. I have imported an INI file. Everything looks OK, except that the new file type is nowhere to be seen.
Thanks for the help,

Henri-Axel


 

