Merging translation memories
Thread poster: Sawaddeekha

Sawaddeekha

English to French
Dec 13, 2012

Hello,

I manage a translation department in a multilingual company. Most of our suppliers use Trados, but we don't.
We would like to merge our suppliers' databases to form a consistent and up-to-date translation memory.
Is there any way we can achieve that without using Trados? Most of the files I've received so far are .tmx files, but I can't open them. Is it possible to export databases from Trados into Excel, for instance?

Thank you for your assistance!



Hermann Bruns  Identity Verified
Local time: 03:08
English to German
Try MetaTexis Dec 13, 2012

Hello Sawaddeekha,

Besides Trados, there are quite a few other CAT tools that can handle TMX files. MetaTexis is an affordable and user-friendly alternative (www.metatexis.com).

Best regards
Hermann



Samuel Murray  Identity Verified
Netherlands
Local time: 03:08
Member (2006)
English to Afrikaans
I'm afraid not Dec 13, 2012

Sawaddeekha wrote:
Most of the files I've received so far are .tmx files, but I can't open them. Is it possible to export databases from Trados into Excel, for instance?


I'm not sure about Trados 2009/11, but Trados 2007 offers only export to TMX and to a simple type of TXT format, neither of which is tab-delimited. However, you can use Okapi Olifant to merge TMs. I don't think it can merge Trados TXT memories, but it certainly can merge Trados TMX memories. There are some types of TMX files that Olifant refuses to import (or imports only partially), but most of them are imported and merged with ease.



SDL Community  Identity Verified
United Kingdom
Local time: 03:08
English
Merging Translation Memories Dec 13, 2012

Hi Sawaddeekha,

If you have SDL Trados Studio then you can use this to merge many different types of translation memories together, and even with different language pairs.

I wrote an article that you can download as a PDF from here, http://goo.gl/SWQ5d , and it explains how to use the translation memory upgrade feature in Studio to merge them all together for the following formats:

sdltm : Studio TM
tmw : Trados TM
txt : Trados TM or WinAlign file
TMX : TM exchange file
mdb : SDLX TM

Perhaps this will be a good way of achieving this task?

Regards

Paul



TransAndLoc
Spain
Local time: 03:08
Try this Dec 13, 2012

Hello,
In our translation company we are experts in translation memory issues.

A tmx file can be opened using Notepad or even better Notepad++ (recommended).

Open both tmx files using Notepad++. Copy the text from <body> until </body>. Do NOT include the tags <body> and </body> themselves, only the text in between. Then paste this text in the second tmx file right after the tag <body>. Save. Now you will have both translation memories in one tmx file.

Now you can use Word to clean the code.

Copy all the text and paste it in a new Word file.

1. Use the find and replace tool:
In options select: use wildcards
Search: \<seg\>*\</seg\>
Replace: (Click on format and then highlight)

2. Then you need to look for the text that is not highlighted and delete it:
In options uncheck: use wildcards
Search: (click on format and highlight twice so it shows "not highlighted")
Replace: ^p (Click on "no format" to delete the previous replacement settings)

3. Now delete the code: <seg> and </seg>
Search: <seg>
Replace: (leave empty)

and
Search: </seg>
Replace: (leave empty)

IMPORTANT: Click on the No format button.

4. Then click on the tab called "Insert" and select the table button. Click on "Convert text into table" and select two columns.

The new table should be ready to copy and paste in Excel now.

IMPORTANT: The list before converting into table must have only one paragraph line between them. For example:

item
item
item
item
etc.


This would be wrong:

item


item
item
item


item

More than one carriage return in a row should be deleted using the find and replace tool:
Find: ^p^p
Replace: ^p

If you have any doubts about this explanation, feel free to post again.




TransAndLoc
Skype: info_transandloc
Webpage: http://www.TransAndLoc.com
Twitter: @transandloc








xxxnrichy
France
Local time: 03:08
French to Dutch
Use Olifant Dec 13, 2012

Use Olifant to convert these tmx files into Wordfast txt files, which are tab delimited text files (the header contains Wordfast indications).
Then just copy the lines below the header from the second file into the first txt file, save and then use Olifant again to convert into tmx.
Tmx can be used later in all CATs.



Yasmin Moslem  Identity Verified
Egypt
Local time: 04:08
English to Arabic
Olifant merging TMX Dec 14, 2012

nrichy wrote:

Use Olifant to convert these tmx files into Wordfast txt files, which are tab delimited text files (the header contains Wordfast indications).
Then just copy the lines below the header from the second file into the first txt file, save and then use Olifant again to convert into tmx.



If you are going to use Okapi Olifant, you do not need such a workaround. Sawaddeekha said "Most of the files I've received so far are .tmx files."

First of all, you can download Okapi Olifant at:
http://okapi.sourceforge.net/applications.html

- Open one TMX TM
- File menu > Import
- Select the other TMX TM (make sure you select "TMX Files" from the dropdown menu).
- The entries of the other TM will be added to the currently open TM.
- File menu > Save.


HTH,
Yasmin



Michael Joseph Wdowiak Beijer  Identity Verified
United Kingdom
Local time: 02:08
Member (2009)
Dutch to English
Heartsome's (open source) TMX editor is the fastest way to merge multiple TMXs Feb 3, 2016

https://github.com/heartsome/tmxeditor8


FarkasAndras
Local time: 03:08
English to Hungarian
No Feb 3, 2016

TransAndLoc wrote:

Hello,
In our translation company we are experts about translation memory issues.

A tmx file can be opened using Notepad or even better Notepad++ (recommended).

Open both tmx files using Notepad++. Copy the text from <body> until </body>. Do NOT include the tags <body> nor </body>, only the text in between. Then paste this text in the second tmx file right after the tag <body>. Save. Now you will have both translation memories in one tmx file.

Yes you can do this, but I would definitely not recommend making this the standard practice. You might get mixed language codes, or languages in mixed order, or you might mess it up one time and struggle to find where you went wrong. The two tmx files might be in different encodings, and I'm not 100% sure that non-ascii characters will copy-paste correctly in all scenarios between files in different encodings. I do mess around in tmx files manually sometimes, but it's not the best practice for regular use especially for someone who doesn't know the internals that well. I have written programs that generate and read TMX files so I have a good grasp of what the format is like and can fix problems. A cryptic error message from a CAT tool after an incorrect copy-paste would stop most people dead in their tracks.
By the way, you need to use character references to get tags to show up here. I fixed part of your post to make it more intelligible. This character reference problem that broke your post is exactly the same one that will break your TMs as well if you follow this procedure (see below). But then an expert knows this already, right?
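
To illustrate the point about mixed language codes, here is a minimal Python sketch (standard library only, not tied to any CAT tool) that reports the header's srclang and the language codes used in a TMX file, so two files can be compared before anyone merges them. The function name tmx_languages is made up for this example, and the sketch assumes the file is well-formed XML in an encoding the parser supports.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def tmx_languages(data):
    """Return (srclang, set of language codes) for the raw bytes of a TMX file.

    tmx_languages is a hypothetical helper name, not part of any tool.
    """
    root = ET.fromstring(data)
    header = root.find("header")
    srclang = header.get("srclang") if header is not None else None
    codes = set()
    for tuv in root.iter("tuv"):
        # TMX 1.4 uses xml:lang; older files may use a plain "lang" attribute.
        code = tuv.get(XML_LANG) or tuv.get("lang")
        if code:
            codes.add(code)
    return srclang, codes
```

Reading each file in binary mode and passing the bytes in (for instance the contents of a hypothetical first.tmx and second.tmx) lets the XML parser pick up the encoding from each file's own declaration. If the two results differ, say en-US in one file and EN-GB in the other, that is worth sorting out before merging.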

TransAndLoc wrote:
Now you can use Word to clean the code.
...


Yes you can do this too, but you shouldn't. If you want to convert tmx to a table, there are better solutions. Quite apart from the potential for errors, this solution doesn't handle character references, so all <, >, & and quote characters will be messed up. It's also a little tedious for regular use. In short, you'd get a mess if you used this method on real-world TMX files.
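
As one example of such a solution, here is a rough Python sketch that turns a TMX file into tab-delimited text that Excel can open. It relies on the standard library's XML parser, which decodes character references such as &amp;amp; and &amp;lt; for you. The function names tmx_to_rows and rows_to_tsv are invented for this example, and the sketch assumes each translation unit holds at most one segment per language.

```python
import csv
import io
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def tmx_to_rows(data, src_lang, tgt_lang):
    """Yield (source, target) pairs from the raw bytes of a TMX file.

    Hypothetical helper; language codes are compared case-insensitively.
    """
    src_lang, tgt_lang = src_lang.lower(), tgt_lang.lower()
    root = ET.fromstring(data)
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.findall("tuv"):
            code = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also keeps text nested in inline elements
                # (including the native codes inside <bpt>, <ept>, <ph>).
                segs[code] = "".join(seg.itertext())
        if src_lang in segs and tgt_lang in segs:
            yield segs[src_lang], segs[tgt_lang]


def rows_to_tsv(rows):
    """Serialise the pairs as tab-delimited text, one pair per line."""
    out = io.StringIO()
    csv.writer(out, delimiter="\t", lineterminator="\n").writerows(rows)
    return out.getvalue()
```

The csv module quotes any segment that itself contains a tab, a quote or a line break, so the columns stay aligned. Units that lack one of the two requested languages are skipped silently, which may or may not be what you want.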



To reply to the OP's question, the solution is:
1) require all translators to send TMX files in all cases. All CAT tools can import/export TMX so it shouldn't be a problem. There is no need for you to handle other formats.
2) find a tool that will merge TMX files for you. There are many options, including any CAT tool you may have.
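
If no tool is at hand, point 2 can also be roughed out in a few lines of Python. This is only a sketch under some assumptions: merge_tmx is a made-up name, the header of the first file is kept as-is, nothing is de-duplicated, and language codes are not checked or harmonised.

```python
import xml.etree.ElementTree as ET


def merge_tmx(first, *others):
    """Append the <tu> elements of the other TMX files to the first one.

    Every argument is the raw bytes of one TMX file; the result is
    UTF-8 encoded bytes. merge_tmx is a hypothetical helper name.
    """
    root = ET.fromstring(first)
    body = root.find("body")
    if body is None:
        raise ValueError("first TMX has no <body> element")
    for data in others:
        other_body = ET.fromstring(data).find("body")
        if other_body is None:
            continue
        # The units are added inside <body>, ahead of its closing tag.
        body.extend(other_body.findall("tu"))
    # xml_declaration requires Python 3.8 or later.
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)
```

Because each input is parsed separately, files in different encodings end up in one consistent UTF-8 file, and special characters are escaped again on output. ElementTree drops any DOCTYPE line, so a tool that insists on one may need it added back.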



[Edited at 2016-02-03 10:52 GMT]



esperantisto  Identity Verified
Local time: 05:08
Member (2006)
English to Russian
Perhaps… Feb 3, 2016

Michael Beijer wrote:

Heartsome's (open source) TMX editor is the fastest way to merge multiple TMXs


… it is not really the fastest, as you need to work via a GUI. The fastest is always something that works in a terminal. For example, SuperTMXMerge looks interesting.



jyuan_us  Identity Verified
United States
Local time: 21:08
Member (2005)
English to Chinese
Can anybody provide the right website to download Okapi Olifant? Apr 25, 2016

I tried to find it by Google search but all the sites I have found that contain links to download Okapi Olifant are fake ones - something else will be downloaded to your computer after you click the links, which you don't need at all.

Thank you for your help in advance.



Arianne Farah  Identity Verified
Canada
Local time: 21:08
Member (2008)
English to French
Here you go Apr 26, 2016

jyuan_us wrote:

I tried to find it by Google search but all the sites I have found that contain links to download Okapi Olifant are fake ones - something else will be downloaded to your computer after you click the links, which you don't need at all.

Thank you for your help in advance.


http://okapi.sourceforge.net/downloads.html



