Cleaning TM in Wordfast Pro
Thread poster: IWagner

IWagner  Identity Verified
Local time: 06:54
English to French
Nov 24, 2011

Can somebody explain to me how to clean a large TM in Wordfast Pro (or tell me where I can find this information)?
Thanks in advance!


 

Yasmin Moslem  Identity Verified
Egypt
Local time: 12:54
English to Arabic
Details Nov 25, 2011

Could you please elaborate slightly on what you mean by "cleaning a TM"?

Many thanks in advance!
Yasmin


 

IWagner  Identity Verified
Local time: 06:54
English to French
TOPIC STARTER
"Clean" TM Nov 25, 2011

I mean removing whatever is not needed (e.g., duplicates). I have one TM that is becoming too big, and I know it contains duplicates. I could do it manually under "Text Edit", but that would be very time-consuming. Under Wordfast Classic, I know there are ways to do it; I don't know about WF Pro.

 

Oliver Walter  Identity Verified
United Kingdom
Local time: 11:54
Member (2005)
German to English
Use Excel Nov 25, 2011

The most useful tool I have found for editing a Wordfast TM is a spreadsheet, and Excel in particular. According to http://www.wordfast.com/support_specifications.html, the format of the TM is the same for WF Classic and WF Pro (I use WF Classic).
You can open the TM text file (file name with extension .txt) in Excel; the fields of each line of the TM are separated by Tab characters, which will put them into different spreadsheet columns. After editing, save the TM file as Tab-separated text and use WF to reorganise it (this will update the count of Translation Units in the 3rd field of the first line).
Useful actions you can do with the spreadsheet include:
  • Select the whole file but not the first line (select columns A to H - possibly more columns depending on the TM's attributes; it won't do any harm to select more columns than those that actually contain data);
  • Sort the rows according to a column depending on your criterion: column A puts them in date order, but collects together the rows marked for deletion (they begin with "x"). Column C shows how many times the TU (Translation Unit) has been re-used (I'm not sure exactly what re-use means here!). Column E is the source text, so sorting on it can reveal duplicates. Column G is the target text. It might be worth sorting on column E and then G in the same sorting action.
  • After sorting you can examine the rows and possibly decide to delete some of them; then perhaps sort again based on a different column and re-examine for deletion.
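For anyone who would rather script the duplicate check than sort by hand, here is a minimal Python sketch of the same idea. It relies only on what is described above: a tab-separated text file whose first line is a header, with the source text in column E and the target text in column G. The function names, the "keep the first occurrence" rule and the default encoding are my own assumptions, not anything Wordfast prescribes, so work on a copy of the TM and adjust the encoding to match your file.

```python
SOURCE_COL = 4  # column E in the spreadsheet view (source text), per the post
TARGET_COL = 6  # column G in the spreadsheet view (target text), per the post


def dedupe_tm_lines(lines):
    """Drop rows whose source AND target text both repeat an earlier row.

    `lines` is the TM content as a list of strings without line endings.
    The first line (the header) is always kept unchanged, and the first
    occurrence of each source/target pair is the one that survives.
    """
    if not lines:
        return []
    header, rows = lines[0], lines[1:]
    seen = set()
    kept = [header]
    for row in rows:
        fields = row.split("\t")
        if len(fields) <= TARGET_COL:
            # Not a complete TU row; keep it rather than guess.
            kept.append(row)
            continue
        key = (fields[SOURCE_COL], fields[TARGET_COL])
        if key in seen:
            continue  # exact duplicate of an earlier TU
        seen.add(key)
        kept.append(row)
    return kept


def dedupe_tm_file(src_path, dst_path, encoding="utf-8"):
    """Hypothetical wrapper: read a TM text file and write a deduplicated copy.

    The encoding is an assumption; pass the one your TM actually uses.
    """
    with open(src_path, encoding=encoding) as f:
        lines = [line.rstrip("\n") for line in f]
    with open(dst_path, "w", encoding=encoding) as f:
        for line in dedupe_tm_lines(lines):
            f.write(line + "\n")
```

Splitting on Tab characters directly, instead of going through a spreadsheet, leaves the text of every field untouched. The TU count in the header is not updated by this sketch, so the TM still needs to be reorganised in WF afterwards, as described above.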


Hope that helps a bit
Oliver


 

Dominique Pivard  Identity Verified
Local time: 13:54
Finnish to French
How "big" is your TM? Nov 26, 2011

IWagner wrote:
I mean removing whatever is not needed (e.g., duplicates). I have one TM that is becoming too big, and I know it contains duplicates. I could do it manually under "Text Edit", but that would be very time-consuming. Under Wordfast Classic, I know there are ways to do it; I don't know about WF Pro.

How "big" (number of TUs) is your TM? In normal operation, it won't make any difference in terms of performance whether your TM has 300,000 or 30,000 TUs, so making it smaller won't increase your productivity.

Duplicates are another matter, and you may indeed want to remove them. As Oliver said, the format of a Wordfast TM is the same regardless of the client (Classic, Pro, Anywhere) that was used to create or access it. So even if the current version of Pro doesn't include a full-fledged editor, you could still use Classic's Data Editor and the relevant functions in Tools > Special filters.

Another possibility is Olifant (http://okapi.sourceforge.net/downloads.html). It's currently Windows-only, but a Java version is in the making, which means it will be possible to use it on Mac and Linux as well. If you have a Windows virtual machine on your Mac, you can use the current version of Olifant.


 

Dominique Pivard  Identity Verified
Local time: 13:54
Finnish to French
You need to be careful with Excel Nov 26, 2011

Oliver Walter wrote:
You can open the TM text file (file name with extension .txt) in Excel

You need to be careful and aware of the pitfalls when opening a Wordfast TM in Excel. For instance, very long segments may get truncated, and you may end up with quotes around your segments. Dedicated TM editors, such as Olifant or the one built into Classic, should be preferred IMO.
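If you do go through Excel anyway, a rough sanity check on the saved file can catch both problems. The Python sketch below is only a heuristic under my own assumptions: the helper names are hypothetical, a field that starts and ends with a double quote is merely flagged as suspicious (a segment can legitimately be quoted), and a drop in the longest field length between the original and the edited file only hints at truncation.

```python
def quoted_fields(lines):
    """Return (line_number, column_index) for every field wrapped in double quotes.

    Line numbers start at 1, column indexes at 0 (column A = 0).
    """
    hits = []
    for line_no, line in enumerate(lines, start=1):
        for col, field in enumerate(line.split("\t")):
            if len(field) >= 2 and field.startswith('"') and field.endswith('"'):
                hits.append((line_no, col))
    return hits


def longest_field(lines):
    """Return the length of the longest tab-separated field (0 for empty input)."""
    return max(
        (len(field) for line in lines for field in line.split("\t")),
        default=0,
    )
```

Run both helpers on the lines of the original TM and on the lines of the file saved from Excel: new entries from `quoted_fields`, or a smaller `longest_field` value, are a reason to inspect the edited file before using it.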


 

