XML files - embedded HTML code - SDL Trados 2011 Thread poster: Anna Sarah Krämer
I am trying to analyze some XML files, and a lot of HTML code shows up in the files (and in the resulting word count). How do I filter the HTML out so that it doesn't appear in the translatable text? Kind regards and thank you for your time, Anna
Hi Anna, you need to set up a suitable filetype for your XML in Studio. Once you have set up the XPath parser rules, go to the embedded content and add the following regular expression: </?[a-z][a-z0-9]*[^<>]*> This should catch most of the embedded HTML tags and exclude them from the analysis. (I apologize for making it this short, but it's already 2.40 a.m. here.)
[Edited: 2012-12-29 01:41 GMT]
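To see what the regular expression from the post above actually matches, it can be tried outside Studio first. The following is a minimal Python sketch, not Studio itself; the sample sentence and the variable names are made up for illustration.

```python
import re

# The pattern quoted in the post: an optional "/", a tag name that starts
# with a letter, then anything up to the closing ">".
TAG_PATTERN = re.compile(r"</?[a-z][a-z0-9]*[^<>]*>")

# Hypothetical segment with embedded HTML, as it might look after the
# XML has been parsed.
sample = 'Click <b>here</b> to open the <a href="help.html">help page</a>.<br/>'

# Every piece of markup the pattern recognises as a tag.
tags = TAG_PATTERN.findall(sample)

# What is left as translatable text once the tags are taken out.
text_only = TAG_PATTERN.sub("", sample)

print(tags)
print(text_only)
```

Note that in Python the pattern as written only matches lowercase tag names; passing `re.IGNORECASE` to `re.compile` would also catch tags such as `<B>`. Whether Studio treats the pattern as case-sensitive is not covered in this thread.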
Anna Sarah Krämer Germany Local time: 15:06 Member (2011) English to German + ... TOPIC STARTER For dummies... | Dec 29, 2012 |
I'm afraid I'll need the explanation "for dummies": what document structure information do I have to put in there? Do I use the regex under Placeholder tags or Tag pairs? Yesterday I tried different regular expressions as shown in the SDL online help. The process seems to be explained well enough and I followed the steps given there, yet none of it seemed to work. What else might be going wrong here?
Hi Anna, I'm sending you a PM. Please reply to my email and I will forward step-by-step instructions along with some screenshots I got from Paul Filkin from SDL. He pulled my buttocks out of the trouble with his detailed instructions on how to set up a filetype. | |
Hi Stanislav | May 23, 2013 |
Could you pleeeease send me those instructions too? I am pulling out hair I don't have any more trying to figure this out. I'm no idiot and I'm an experienced translator, but I'm new to Trados and these XML settings are just too much for me.
Please also send me these instructions | Aug 6, 2013 |
I have an XML file with HTML codes in it, and I don't want all of this junk to become part of my TM. | | |
structure information | Feb 2, 2014 |
I also struggled with this issue for some time, until I found out that the "Document Structure Information" has to match the one that is defined for the element in the parser rules. If you do not set any structure info for the parser rule, it won't work. What exactly you choose as structure info is arbitrary. It only has to match. Also described here:...
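The matching requirement described above can be pictured with a toy model. This is only an illustrative Python sketch of the idea, not how Studio is implemented: the names `PARSER_RULES`, `EMBEDDED_CONTENT_STRUCTURES` and `translatable_text`, the sample XML, and the label "Paragraph" are all hypothetical, and the sketch simply strips the tags where Studio would turn them into tags in the editor.

```python
import re
import xml.etree.ElementTree as ET

TAG_PATTERN = re.compile(r"</?[a-z][a-z0-9]*[^<>]*>")

# Hypothetical parser rules: element name -> structure info label.
# None means no structure info was set for that rule.
PARSER_RULES = {"title": None, "body": "Paragraph"}

# Hypothetical embedded-content setting: the structure labels the
# regex is applied to.
EMBEDDED_CONTENT_STRUCTURES = {"Paragraph"}

def translatable_text(elem):
    """Strip embedded tags only if the element's structure label matches."""
    text = elem.text or ""
    structure = PARSER_RULES.get(elem.tag)
    if structure is not None and structure in EMBEDDED_CONTENT_STRUCTURES:
        return TAG_PATTERN.sub("", text)
    # No structure info, or no match: the regex is never applied.
    return text

doc = ET.fromstring(
    "<item>"
    "<title>&lt;b&gt;News&lt;/b&gt;</title>"
    "<body>&lt;p&gt;Hello &lt;i&gt;world&lt;/i&gt;&lt;/p&gt;</body>"
    "</item>"
)

results = {child.tag: translatable_text(child) for child in doc}
print(results)
```

In this model the `title` element keeps its HTML as plain text because its rule has no structure info, while `body` is cleaned because its label matches. The label itself could be anything, as long as both sides use the same one.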
ulrika månsson (X) Sweden English to Swedish + ... thank you, ameisenmann! | Mar 28, 2014 |
ameisenmann wrote: I also struggled with this issue for some time, until I found out that the "Document Structure Information" has to match the one that is defined for the element in the parser rules. If you do not set any structure info for the parser rule, it won't work. What exactly you choose as structure info is arbitrary. It only has to match.
This was the final info I needed, problem solved, thank you ever so much! /Ulrika
rgokalp Türkiye Local time: 16:06 English to Turkish stellent xml translation | Apr 1, 2014 |
Hi, does anyone have experience translating "stellent xml" with Studio 2011? These files are normally detected as any-xml, and they contain too many tags. I tried both adding regex rules to any-xml and defining a new file type by adding regex rules and adjusting the detection settings. Neither worked. The tags remain as plain text, and Studio runs so slowly that even translating very small files is becoming a real pain. Thank you very much in advance for any help.
Most likely... | Apr 1, 2014 |
... you are not defining the filetype correctly. The sensible process (for me) is this:
1. Create a new XML filetype first
2. Import your stellent.xml file to get the possible parser rules for that file
3. Refine your parser rules so you get only what you need
4. Add structure to the rules that contain embedded content
5. Add the regex rules so they apply to the appropriate structure
You do this in Tools -> Options and not Project Settings (at least this is the way I would do it), as it's faster to check that the filetype settings work through the Open Document command. Now create your Project and check that your new filetype is picked up. That's it. If you have done all of this and it still does not work, I'd be happy to take a look at this for you. You can send me an appropriate stellent.xml sample file via [email protected] Regards Paul
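Before refining the parser rules (steps 2 and 3 above), it can help to know which elements a sample file contains in the first place. The Python sketch below lists the distinct element paths of an XML document. It is a rough pre-check outside Studio, not a replacement for the import step; the function name `element_paths` and the sample XML are made up and do not reflect the real Stellent structure.

```python
import xml.etree.ElementTree as ET

def element_paths(xml_text):
    """Return the distinct element paths, in order of first appearance."""
    root = ET.fromstring(xml_text)
    seen = []

    def walk(elem, prefix):
        path = f"{prefix}/{elem.tag}"
        if path not in seen:
            seen.append(path)
        for child in elem:
            walk(child, path)

    walk(root, "")
    return seen

# Hypothetical sample file: the body element carries escaped HTML.
sample = (
    '<root>'
    '<item id="1"><title>Hello</title>'
    '<body>&lt;p&gt;Some &lt;b&gt;bold&lt;/b&gt; text&lt;/p&gt;</body></item>'
    '<item id="2"><title>Bye</title><body>Plain</body></item>'
    '</root>'
)

paths = element_paths(sample)
for p in paths:
    print(p)
```

A list like this makes it easier to decide which elements should be translatable and which of them contain embedded content and therefore need structure info and regex rules.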
rgokalp Türkiye Local time: 16:06 English to Turkish
SDL Support wrote: ... you are not defining the filetype correctly. The sensible process (for me) is this: 1. Create a new XML filetype first 2. Import your stellent.xml file to get the possible parser rules for that file 3. Refine your parser rules so you get only what you need 4. Add structure to the rules that contain embedded content 5. Add the regex rules so they apply to the appropriate structure You do this in Tools -> Options and not Project Settings (at least this is the way I would do it) as it's faster to check the filetype settings work through the Open Document command. Now create your Project and check that your new filetype is picked up. That's it. If you have done all of this and it still does not work I'd be happy to take a look at this for you. You can send me an appropriate stellent.xml sample file via [email protected] Regards Paul
Thank you very much Paul, it worked. Best,
New XML file type in Studio 2011 | Mar 14, 2016 |
Hello there, I am trying to create a new XML file type in Studio 2011, but the new XML type is not displayed in the list of extensions when I try to add a file through the Add Files option in Studio 2011. I have imported an INI file. Everything looks OK, except that the new file type is nowhere to be seen. Thanks for the help, Henri-Axel