Combining multiple XML files (monolingual) into a single XML file (multiple languages)
Thread poster: flebelgium

flebelgium
English to Italian
Aug 9, 2010

Dear Colleague,

I would greatly appreciate your advice on how to combine multiple XML files (one language per file) into a single XML file (all 4 target languages in one file).

Any pointers on how to perform this task would be highly appreciated.

Thank you.

Best regards,

HK Lim


 

Adam Łobatiuk  Identity Verified
Poland
Local time: 14:01
Member (2009)
English to Polish
+ ...
Text editor Aug 9, 2010

I'm not sure what exactly you want to achieve here. First of all, XML files are just text files that you can edit in any text editor, like Notepad. So, you could copy the content of your successive files and paste it all into one file (e.g. the first one). However...

XML files normally begin with an XML declaration like
<?xml version="1.0" encoding="UTF-8"?>

A few optional lines follow, and then there's the root tag, which could be called anything, for example <document>. The matching closing tag, </document>, ends the entire file.

If you just merge together the entire content of your files, the XML file will be unusable, because it can only contain one declaration and one pair of root tags. That is why you should probably copy and paste the content between the root tags.
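A minimal sketch of the "copy the content between the root tags" idea, using Python's standard xml.etree.ElementTree module. The <document> and <entry> names and the two inline strings are made up for illustration; real files would be read with ET.parse(path).getroot(). Note that this only concatenates records under one root, it does not line up translations of the same record.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for the contents of two files on disk.
FILE_A = '<?xml version="1.0" encoding="UTF-8"?><document><entry>one</entry></document>'
FILE_B = '<?xml version="1.0" encoding="UTF-8"?><document><entry>two</entry></document>'


def merge_under_one_root(xml_texts):
    """Keep the first document's root and append the children of the others."""
    roots = [ET.fromstring(text.encode("utf-8")) for text in xml_texts]
    merged = roots[0]
    for other in roots[1:]:
        if other.tag != merged.tag:
            raise ValueError(f"root tags differ: {merged.tag!r} vs {other.tag!r}")
        # Only the child elements are copied, so the result still has
        # exactly one root element.
        merged.extend(list(other))
    return merged


merged_root = merge_under_one_root([FILE_A, FILE_B])
merged_xml = ET.tostring(merged_root, encoding="unicode")
print(merged_xml)
```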

If you have SDL Trados 2007 or 2006, it has a utility called Trados Glue, which merges files, including XML files, into TTX files.


 

flebelgium
English to Italian
TOPIC STARTER
Merging Multiple XML files into a Single XML file Aug 9, 2010

Hello Adam,

Many thanks for your kind advice. I've tried the method that you suggested, i.e. using SDL Trados Glue, but it doesn't achieve the desired result.

The purpose is to merge the multiple XML files (basically one language per XML file) into a single XML file (consisting of 1 source language and 4 target languages). After merging the XML files, 5 files in total, the desired result would be as follows:

-
13895
Sub Gear
Sub-Getriebe
Garniture secondaire
Ingranaggio secondario
Sub -engranaje


As you can see above, the first line of text is in the source language (EN) and the subsequent 4 lines are in the target languages (i.e. German, French, Italian and Spanish).

Would you mind helping me achieve this result?

Thank you in advance.

Best regards,

HK Lim


 

opolt  Identity Verified
Germany
Local time: 14:01
English to German
+ ...
XML is not XML Aug 9, 2010

flebelgium, as Adam said, it would really be helpful if you explained to us in more detail what it is that you actually want to achieve. It is also important to know which XML dialect these files are based on, which DTDs, schemas or whatever define their structure, etc.

In general, you can join several XML files without much ado, because on the lower level, as Adam said, they are just text files.

However, if you want to keep them intact as valid XML, most likely as the input of some documentation system (which appears to be your case), it is not advisable to merge those files, just like that. Many XML dialects don't require you to merge files in order to merge sections anyway; that is something normally achieved by the engine in the background in combination with the file syntax, and/or some file containing metadata about the structure of the complete document. If you do it by hand, most likely there's going to be trouble ahead.

XML files, in documentation/presentation, are normally just the input of some type of engine producing different types of human-readable output: XHTML (web pages), PDF, RTF, whatever. XML mostly follows a very strict (and in some cases hugely complex) syntax; so you either have to find a tool which is able to merge the specific type of files you're dealing with, or you have to learn the syntax of that specific XML dialect, which is not a thing you'll be able to do in one day.

And when dealing with different languages, there are also some encoding/locale issues to be taken into account. For instance, even if you succeeded in merging the files in the way Adam suggested, depending on the XML dialect you might also have to add the "xml:lang" attribute correctly for each section of the file; if not, the system might produce garbled characters, or do other unexpected things.

So, in short, don't do it.
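For anyone who goes ahead anyway, here is a rough sketch of what the "xml:lang" point looks like in practice, using Python's standard xml.etree.ElementTree. The <entry> and <term> element names are hypothetical; whether a given dialect accepts xml:lang at all, and on which elements, depends on its DTD or schema.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is bound to this fixed namespace by the XML specification,
# so ElementTree writes the attribute below as xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

entry = ET.Element("entry", {"id": "13895"})  # hypothetical element names
for lang, text in [("en", "Sub Gear"), ("de", "Sub-Getriebe")]:
    term = ET.SubElement(entry, "term")
    term.set(XML_LANG, lang)  # mark the language of this section
    term.text = text

tagged_xml = ET.tostring(entry, encoding="unicode")
print(tagged_xml)
```

When saving to disk, ET.ElementTree(entry).write(path, encoding="UTF-8", xml_declaration=True) keeps the declared encoding and the actual bytes in agreement, which is the usual cause of garbled characters when they differ.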


 

flebelgium
English to Italian
TOPIC STARTER
Merging Multiple XML (monolingual) into Single XML (multilingual) Aug 10, 2010

Hello,

Noted your comment with thanks.

Basically, the main objective of merging the multiple XML files into a single XML file is to import that single file (consisting of 1 source language and 4 target languages) into database software, i.e. MS Access.

The DTD defining the structure was set up with Tag Settings in Trados. If I'm not mistaken, it should be possible to use Trados TagEditor to merge multiple XML files into a single XML file and so achieve the objective I outlined above.

Your kind assistance would be highly appreciated.

Thank you in advance.

Best regards,

HK Lim


 

Soonthon LUPKITARO(Ph.D.)  Identity Verified
Thailand
Local time: 19:01
Member (2004)
English to Thai
+ ...
SdlTradosGlue.exe Aug 11, 2010

In SDL Trados 2007, the folder C:\Program Files\SDL International\T2007\TT contains the utility named above, which merges/splits XML files easily.

Best regards,
Soonthon L.

[Edited at 2010-08-11 04:06 GMT]


 

flebelgium
English to Italian
TOPIC STARTER
Merging Multiple XML (monolingual) into Single XML (multilingual) Aug 11, 2010

Hello Lupkitaro,

Thank you for your help. Unfortunately, when I merge the 5 monolingual XML files (1 source and 4 target languages), the result I get is a monolingual TTX file, i.e. it contains the first target language only, as shown below:

13895
Sub-Getriebe

This is not the result I want, which is shown below:

13895
Sub Gear
Sub-Getriebe
Garniture secondaire
Ingranaggio secondario
Sub -engranaje

Any advice on how to overcome this problem and achieve the desired result outlined above?

Thank you in advance.

Best regards,

HK Lim


 

Daniel García
English to Spanish
+ ...
SDL Trados glue will not do the job Aug 12, 2010

With the SDL Trados glue tool, you can merge several XML files into one TTX file but not into one XML.

You could do the job manually by opening all the XML files with a text editor and then copying and pasting each single sentence together with the appropriate tags.

To do this you would first need to learn something about XML (lots of useful pages on the Internet) and how it works, in order to make sure that the merged XML file is valid.

This can also be done automatically by somebody who knows XML programming.
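To give an idea of what the automatic route might look like, here is a rough Python sketch that matches records across monolingual inputs by their ID and writes them into one multilingual tree. Everything about the structure is an assumption: the <terms>/<term> input elements, the id attribute, and the <entries>/<entry>/<text> output elements are invented for illustration, because the real tag names were never posted in this thread. The inline strings stand in for the five files.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical monolingual inputs, one per language. The real element and
# attribute names depend on the files' DTD and will differ.
MONOLINGUAL = {
    "en": '<terms><term id="13895">Sub Gear</term></terms>',
    "de": '<terms><term id="13895">Sub-Getriebe</term></terms>',
    "fr": '<terms><term id="13895">Garniture secondaire</term></terms>',
    "it": '<terms><term id="13895">Ingranaggio secondario</term></terms>',
    "es": '<terms><term id="13895">Sub -engranaje</term></terms>',
}


def merge_by_id(sources):
    """Build one multilingual tree, matching records on their id attribute."""
    root = ET.Element("entries")
    entries = {}  # id -> <entry> element, in first-seen order
    for lang, xml_text in sources.items():
        for term in ET.fromstring(xml_text).iter("term"):
            term_id = term.get("id")
            if term_id not in entries:
                entries[term_id] = ET.SubElement(root, "entry", {"id": term_id})
            text = ET.SubElement(entries[term_id], "text")
            text.set(XML_LANG, lang)
            text.text = term.text
    return root


multilingual = merge_by_id(MONOLINGUAL)
print(ET.tostring(multilingual, encoding="unicode"))
```

The source language comes first simply because it is the first key in the dictionary; a record missing from one file just ends up with fewer <text> children.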

You might be better off outsourcing this merge task to a person who knows what they're doing.

The question is: why do you want a merged XML file? Has your customer asked for it? How will this merged XML file be used? What's the purpose? There might be simpler ways of achieving your purpose.

If the customer asked for it, haven't they given you specific instructions or at least detailed information of what they expect to obtain from you by merging the files?

Daniel


 

Achim Herrmann
Local time: 14:01
English to German
Depends on the structure Aug 19, 2010

Hello,

I think this strongly depends on the XML structure. It would be helpful to get a small sample with some dummy text in it.

I think it may be possible to achieve this goal using a software localization tool that supports multilingual XML files.

Hope this helps
Achim Herrmann


 

FarkasAndras
Local time: 14:01
English to Hungarian
+ ...
I would... Aug 19, 2010

... try to extract the relevant content from each file first (i.e. convert all files to tab delimited txt files, one row per record), and then put it all in a tab delimited file/spreadsheet and call it a day.
Given that we have no idea about the structure of the XML files or how similar the structure is across the 5 files, it's impossible to tell how difficult the task is, and it is impossible to suggest a foolproof solution or software.
If the xml structure is simple, the whole job would take me 5 minutes.
If it's complex and the 5 files are very different from each other, I may not be able to do it at all.
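For the simple case, the extract-to-tab-delimited step could look roughly like the Python sketch below. It assumes, purely for illustration, that each file holds <term id="..."> elements under a single root; those names are hypothetical, and the inline strings stand in for the real files. Only two languages are shown to keep it short.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical monolingual inputs; real tag and attribute names will differ.
SOURCES = {
    "en": '<terms><term id="13895">Sub Gear</term></terms>',
    "de": '<terms><term id="13895">Sub-Getriebe</term></terms>',
}


def to_rows(sources):
    """Return a header row plus one row per record id, one column per language."""
    langs = list(sources)
    table = {}  # id -> {lang: text}
    for lang, xml_text in sources.items():
        for term in ET.fromstring(xml_text).iter("term"):
            table.setdefault(term.get("id"), {})[lang] = term.text or ""
    rows = [["id"] + langs]
    for term_id, by_lang in table.items():
        # A record missing from one file gets an empty cell for that language.
        rows.append([term_id] + [by_lang.get(lang, "") for lang in langs])
    return rows


buffer = io.StringIO()
csv.writer(buffer, delimiter="\t", lineterminator="\n").writerows(to_rows(SOURCES))
tsv_text = buffer.getvalue()
print(tsv_text)
```

Writing to an open file instead of io.StringIO gives a tab-delimited file that a spreadsheet or database import wizard can read.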

Are the files large? If they are no longer than a couple of thousand segments, you can get away with common tools like MS Office. It's just a matter of coming up with the right search and replace operation in Word, really. If they run to tens or hundreds of thousands of segments, you'll have to use something like sed or Perl, which take a bit of getting used to.

Note: I know nothing about importing XML files into Access, but I can't help but think that if you expect to be able to import a 5-language XML file into Access, then you should be able to import the 5 files one by one and populate the same database with them. Am I being crazy here?
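To illustrate the one-file-at-a-time idea, here is a small Python sketch that loads each monolingual file into the same table, keyed on record ID and language. It uses SQLite from the standard library as a stand-in, not Access itself, and the <term id="..."> structure plus the table and column names are assumptions made up for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical monolingual inputs; real tag and attribute names will differ.
SOURCES = {
    "en": '<terms><term id="13895">Sub Gear</term></terms>',
    "de": '<terms><term id="13895">Sub-Getriebe</term></terms>',
}

conn = sqlite3.connect(":memory:")  # SQLite stands in for Access here
conn.execute(
    "CREATE TABLE translations ("
    "id TEXT, lang TEXT, term TEXT, PRIMARY KEY (id, lang))"
)


def import_file(conn, lang, xml_text):
    """Load one monolingual file; re-importing a language overwrites its rows."""
    rows = [
        (term.get("id"), lang, term.text)
        for term in ET.fromstring(xml_text).iter("term")
    ]
    conn.executemany(
        "INSERT OR REPLACE INTO translations (id, lang, term) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()


for lang, xml_text in SOURCES.items():
    import_file(conn, lang, xml_text)

# All languages for one record now sit side by side in the same table.
record = dict(
    conn.execute(
        "SELECT lang, term FROM translations WHERE id = ?", ("13895",)
    ).fetchall()
)
print(record)
```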

[Edited at 2010-08-19 20:05 GMT]


 

