Need advice on creating a TM based on two existing XML files.
Thread poster: Florian_B
Florian_B
Germany
Local time: 03:15
English to German
Oct 28, 2013

I am trying to align two XML files, structured as shown in the code boxes below, in order to create a TM for Trados 2014.

However, I cannot figure out how to do that.

Could someone point me to where I could read up on this?

Basically, I am hoping the aligner can read the string name="lottery_first_prize" part, match each string in the source document with the string of the same name in the target-language document, and then create a TM from the pairs.


(Please quote my post to see the actual code; I don't know why it won't display the code.)

source
Code:
一等奖:
二等奖:
三等奖:
四等奖:
五等奖:




(already translated target)

Code:
1. Platz: 
2. Platz:
3. Platz:
4. Platz:
5. Platz:



FarkasAndras
Local time: 03:15
English to Hungarian
Is this all there is to it? Oct 28, 2013

It's possible that WinAlign or other aligners will handle these XML files correctly. I personally prefer to handle files of this sort manually, because an aligner could cause more trouble than it's worth (if it tries to sentence-segment the strings or "fix" their alignment).

Is this file made up of just strings in <string name> tags and nothing else?
If it is, you could do something like this:
Open the file in a text editor that can do regex search and replace, such as Notepad++.
Remove the header & footer.
Replace all line breaks with a space.
Replace <string name=[^>]*> with a line break (\n).
Replace <.string> with nothing (the . matches any single character, here the / of the closing tag).
Replace &lt; with < and &gt; with >.
Search for & in the file to make sure there are no other character references you need to convert.
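
For anyone who would rather script it, the steps above could be sketched in Python roughly like this. It is only a sketch, assuming every entry in the file looks like <string name="...">text</string>. The function name strings_to_lines is made up for illustration, and html.unescape stands in for the manual &lt; / &gt; replacements and the check for other character references.

```python
import html
import re


def strings_to_lines(xml_text: str) -> list[str]:
    """Return the text of each <string name=...> entry, one item per string."""
    # "Remove the header & footer": keep only the span from the first
    # opening <string ...> tag to the last closing </string> tag.
    start = xml_text.find("<string ")
    end = xml_text.rfind("</string>")
    if start == -1 or end == -1:
        return []
    body = xml_text[start:end + len("</string>")]

    # Replace all line breaks (and the indentation around them) with a space.
    body = re.sub(r"\s*\n\s*", " ", body)
    # Replace every opening tag with a line break.
    body = re.sub(r"<string name=[^>]*>", "\n", body)
    # Replace every closing tag with nothing.
    body = body.replace("</string>", "")

    # The text before the first line break is empty, so skip it. Empty
    # strings are kept on purpose so the row numbers still match up
    # between the source and target files.
    rows = body.split("\n")[1:]
    # Convert &lt;, &gt;, &amp; and other character references.
    return [html.unescape(row.strip()) for row in rows]
```

Run it on the source file and on the target file, and you get two lists that can be pasted into adjacent columns.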

You should get a plain text file with just the text itself, one string per row. Do the same with the other file, copy-paste the two files into adjacent columns in an Excel file and make sure the strings are correctly paired. Then either generate a TMX file out of the Excel sheet or save it as a tab-separated text file and import it into Trados as a bilingual file.
Of course, this presupposes that both files have the exact same strings in the exact same order. If that's not the case, the XML files need to be parsed and the strings paired up based on the string ID (string name). That's not too difficult; I could do it, but it takes more than search and replace.
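
To give an idea of what pairing by string ID could look like, here is a minimal Python sketch using the standard xml.etree.ElementTree module. It assumes both files are well-formed XML with <string name="..."> elements. The function names (read_strings, pair_by_name, to_tsv) are made up for illustration, and strings missing from the target file are simply skipped.

```python
import xml.etree.ElementTree as ET


def read_strings(xml_text: str) -> dict[str, str]:
    """Map each string name to its text, with whitespace collapsed."""
    root = ET.fromstring(xml_text)
    return {
        el.get("name"): " ".join("".join(el.itertext()).split())
        for el in root.iter("string")
        if el.get("name") is not None
    }


def pair_by_name(source_xml: str, target_xml: str) -> list[tuple[str, str, str]]:
    """Return (name, source text, target text) for names found in both files."""
    source = read_strings(source_xml)
    target = read_strings(target_xml)
    # Follow the order of the source file; the target order does not matter.
    return [(name, text, target[name])
            for name, text in source.items() if name in target]


def to_tsv(pairs: list[tuple[str, str, str]]) -> str:
    """Build tab-separated text with one source/target pair per row."""
    # Collapsing whitespace in read_strings already removed any tabs or
    # line breaks inside the strings, so a plain join is safe here.
    return "".join(f"{src}\t{tgt}\n" for _name, src, tgt in pairs)
```

The result of to_tsv can be saved as a UTF-8 text file. It is still worth eyeballing the pairs before importing anything.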

[Edited at 2013-10-28 08:45 GMT]


Florian_B
Germany
Local time: 03:15
English to German
TOPIC STARTER
worked! Oct 28, 2013


Replace <string name=[^>]*> with a line break (\n).



This was what I didn't know before... thanks.

I did it the way you suggested.

Luckily the file was perfectly paired up!

Thank you very much.


FarkasAndras
Local time: 03:15
English to Hungarian
welcome Oct 28, 2013

Glad it worked.
BTW, [] stands for "any one of the characters within the brackets"; [^] stands for "any one character except those within the brackets"; and * stands for "any number (including zero) of the previous item". So [^>]*> matches "any number of characters that can be anything except >, followed by a >".
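
If you want to see the pattern at work outside a text editor, here is a small Python sketch. The sample line is put together from the string name in the first post; the variable names are just for illustration.

```python
import re

# [^>]* = any number of characters that are not ">", then a literal ">".
opening_tag = re.compile(r"<string name=[^>]*>")
# The "." matches any single character, here the "/" of the closing tag.
closing_tag = re.compile(r"<.string>")

line = '<string name="lottery_first_prize">一等奖:</string>'

# The match stops at the first ">", so only the opening tag is matched.
matched = opening_tag.match(line).group(0)

# Opening tag becomes a line break, closing tag is removed.
text_only = closing_tag.sub("", opening_tag.sub("\n", line))
```

Here matched holds just the opening tag, and text_only holds a line break followed by the string text.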

[Edited at 2013-10-28 15:41 GMT]

