Removing identical entries from Trados/SDLX TMs
Thread poster: lingoneer

lingoneer  Identity Verified
Local time: 00:56
English to Finnish
+ ...
Sep 19, 2008

Dear all,

I am faced with the following dilemma:

We have received a large TM (English>Finnish) from our client in Trados 2007 format (.tmw). Besides translations of proper sentences, it also contains entries which are identical in the source language and the target language.

Example:

in English: RT6-JG-U7
in Finnish: RT6-JG-U7

These entries are mostly product codes (consisting of numbers and letters).

Is there any way to automatically remove from the TM the entries that are identical in the source and target language? Only those identical entries should be removed, so that afterwards the TM contains no entry whose source and target are the same. We can convert the TM into SDLX format if necessary.

Thanks for any help you can give.

Tuomas / Lingoneer



Vito Smolej
Germany
Local time: 23:56
Member (2004)
English to Slovenian
+ ...
Maybe there's a shorter way to do it... Sep 19, 2008

... but that's how I do it:

i) export the TM to txt or tmx
ii) convert to CSV - for instance using PlusToyz
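iii) import into Excel and compare column A with column B (for instance, put =IF(A1=B1;1;0) in column C; use commas instead of semicolons if your locale requires it)
iv) delete all rows flagged with 1, i.e. those where source and target are identical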
v) copy what remains to a Word table
vi) use PlusToyz to convert the table to bilingual doc
vii) import the file into a new TM

Steps ii-v can of course be done in Word as well; I just don't have the patience to figure out how to do it in Word.
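For those who prefer a script to a spreadsheet, steps iii and iv can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the CSV from step ii is assumed to hold the source text in the first column and the target text in the second, to be UTF-8 encoded, and to use a semicolon as delimiter. The function names and file names are made up for illustration.

```python
import csv


def drop_identical_rows(rows):
    """Yield only the rows whose source (column 0) and target (column 1) differ.

    Assumption: each row is [source, target, ...]. Leading and trailing
    whitespace is ignored when comparing; the comparison is case-sensitive.
    """
    for row in rows:
        if len(row) >= 2 and row[0].strip() == row[1].strip():
            continue  # source == target, e.g. a product code: skip it
        yield row


def filter_csv(in_path, out_path, delimiter=";"):
    """Copy in_path to out_path, leaving out rows with identical source and target."""
    with open(in_path, newline="", encoding="utf-8") as src, \
            open(out_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.reader(src, delimiter=delimiter)
        writer = csv.writer(dst, delimiter=delimiter)
        writer.writerows(drop_identical_rows(reader))
```

Calling `filter_csv("tm_export.csv", "tm_filtered.csv")` (both file names hypothetical) would write the remaining rows to a new file, which can then go through steps v to vii as before. Unlike a plain `=` comparison in Excel, this comparison is case-sensitive, so "abc" and "ABC" are treated as different.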

The alternative is File > Maintenance in TWB: for example, if the entries all start with RT6, you can search for them globally ("RT6*" in the source and the same in the target) and then delete them all.

This feature is worth suggesting on the SDL ideas site (I will do that right now).

regards

Vito



Jorge Payan  Identity Verified
United States
Local time: 17:56
Member (2002)
German to Spanish
+ ...
Olifant should do the trick! Sep 19, 2008

Memory management is not one of TRADOS's strong points... among others.

Olifant is lightweight and free. You can download it from http://www.translate.com/technology/tools/olifant/OlifantInstaller.zip

It seems you have to use the TRADOS export file (.txt) as Olifant cannot work on the .tmw file directly.

Enjoy!



lingoneer  Identity Verified
Local time: 00:56
English to Finnish
+ ...
TOPIC STARTER
Commercial editors available Sep 22, 2008

Hi all,

And thanks for your comments.

It seems that some commercially available TMX editors can do this automatically. I tried Heartsome TMX Editor, which has an option that removes this type of entry with a single mouse click (Tasks > Remove Rows with Same Text in all Columns). This is most helpful for large TMs (200,000+ entries) in which a large percentage of the entries has to be removed.
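For anyone without such an editor, the same kind of filter can be sketched directly on a TMX export with Python's standard library. This is a rough sketch, not a description of how Heartsome implements its option. It assumes a standard TMX layout (`body` > `tu` > `tuv` > `seg`) and compares only the plain text of each segment, so inline tags are ignored. The function names are made up for illustration.

```python
import xml.etree.ElementTree as ET


def seg_text(tuv):
    """Return the plain text of a <tuv>'s <seg>, ignoring inline tags."""
    seg = tuv.find("seg")
    return "".join(seg.itertext()).strip() if seg is not None else ""


def remove_identical_units(tree):
    """Remove every <tu> whose language variants all carry the same text.

    Modifies the tree in place and returns the number of units removed.
    """
    body = tree.getroot().find("body")
    if body is None:
        return 0
    removed = 0
    for tu in list(body.findall("tu")):  # copy the list: we remove while looping
        texts = [seg_text(tuv) for tuv in tu.findall("tuv")]
        if len(texts) >= 2 and len(set(texts)) == 1:
            body.remove(tu)
            removed += 1
    return removed
```

A possible use, with hypothetical file names: `tree = ET.parse("export.tmx")`, then `remove_identical_units(tree)`, then `tree.write("filtered.tmx", encoding="utf-8", xml_declaration=True)`. Note that writing the file back this way drops any DOCTYPE line from the original, so it is worth testing the import on a copy of the TM first.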

I'm not sure if Olifant has this feature but will check this.

Tuomas / Lingoneer



Fabio Descalzi  Identity Verified
Uruguay
Local time: 19:56
Member (2004)
German to Spanish
+ ...
Moving this thread... Sep 22, 2008

... to SDL Trados forum
