Tip: create/extract TM from source and target word files
Thread poster: Lianne van de Ven

Lianne van de Ven  Identity Verified
United States
Local time: 05:39
Member (2008)
English to Dutch
+ ...
Mar 23, 2011

I found a way to create a TM based on two regular Word files, one in the source language and the other in the target language. Maybe this is already common knowledge, but I could not find information about it, so I decided to post it here. This tip is helpful if you want to create a TM based on files that have already been translated. The main purpose is to build up TMs per topic or per client.

I came up with the idea because I prefer to proof bilingual files (that is, single files). To solve this, I copy the source and target files in a table in Word, so I have a column with the source and a column with the target. This works fine for proofing. Today I took it a step further and I was able to create a TM based on these two files with Wordfast Pro.

I opened each file (source and target) in Wordfast Pro and filled the target fields with the source for each of them. This results in two mono-lingual txml's. I then committed both to the same, new, 'scratch' TM (source-target language combination).

After closing both files, I opened the TM in Excel. The file has several columns (time stamp, user name, source, target, etc.); going down the rows, the group of source-source entries comes first, followed by the group of target-target entries.
I copied the cells with the target segments into the 'target' cells of the source-source entries, resulting in source-target combinations. Then I removed the target-target entries and saved the file under a new name.

Using this as the TM, I was able to fill a new txml correctly with the target segments, so my TM works! For some reason one segment was skipped in the target-target version, so I did have to move the target column one cell down, but all other content was segmented identically. I can now proof the text in txml.

Maybe there is an easier way to do this? Maybe it's not all that useful?
I am not sure, but I liked 'my new trick' so I thought I'd post it.
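
For anyone who would rather script the copy-and-paste step than do it in Excel, here is a minimal Python sketch of the merge described above. It assumes the TM is a tab-delimited text file whose rows hold the source text in the fifth column and the target text in the seventh; that column order, the function name `merge_monolingual_rows`, the `split_at` parameter and the sample rows are all illustrative assumptions, not something Wordfast provides.

```python
from typing import List

# Assumed 0-based column order of one TM row (tab-delimited):
# 0 timestamp, 1 user, 2 usage count, 3 source language, 4 source text,
# 5 target language, 6 target text, then optional attribute columns.
TARGET_TEXT = 6


def merge_monolingual_rows(rows: List[str], split_at: int) -> List[str]:
    """Turn source-source and target-target rows into source-target rows.

    rows     -- TM rows without the header line, source group first
    split_at -- hypothetical parameter: index of the first target-target row
    """
    source_rows = rows[:split_at]
    target_rows = rows[split_at:]
    if len(source_rows) != len(target_rows):
        # A skipped segment, as happened in the post, is caught here
        # instead of silently shifting every later pair.
        raise ValueError(
            f"{len(source_rows)} source rows vs {len(target_rows)} target rows"
        )
    merged = []
    for source_row, target_row in zip(source_rows, target_rows):
        fields = source_row.split("\t")
        # Overwrite the duplicated source text with the real translation.
        fields[TARGET_TEXT] = target_row.split("\t")[TARGET_TEXT]
        merged.append("\t".join(fields))
    return merged


# Made-up sample rows: two source-source entries, then two target-target ones.
rows = [
    "20110323~100000\tLV\t0\tEN-US\tGood morning.\tNL-NL\tGood morning.",
    "20110323~100001\tLV\t0\tEN-US\tThank you.\tNL-NL\tThank you.",
    "20110323~100500\tLV\t0\tEN-US\tGoedemorgen.\tNL-NL\tGoedemorgen.",
    "20110323~100501\tLV\t0\tEN-US\tDank u.\tNL-NL\tDank u.",
]
for row in merge_monolingual_rows(rows, split_at=2):
    print(row)
```

The length check is deliberate: if the two files were segmented differently, the pairing has to be fixed by hand first, just as the target column had to be moved one cell down in Excel.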


 

Heinrich Pesch  Identity Verified
Finland
Local time: 12:39
Member (2003)
Finnish to German
+ ...
We call it aligning Mar 23, 2011

There are special tools for it like Winalign.

 

Lianne van de Ven  Identity Verified
United States
Local time: 05:39
Member (2008)
English to Dutch
+ ...
TOPIC STARTER
Thank you Mar 23, 2011

Heinrich Pesch wrote:

There are special tools for it like Winalign.


Thank you for enlightening me. Who would have guessed?
I am in need of a workshop, I think. Will look into 'aligning'.
Anyway, this playing around helps to get to know software too.


 

Alex Lago  Identity Verified
Spain
Local time: 11:39
Member (2009)
English to Spanish
+ ...
Wordfast has PlusTools for this Mar 24, 2011

As you posted this on the Wordfast support forum and say you used Wordfast Pro, I take it you have the latest version, 2.4.1. I'm asking because since version 2.3 it has included Wordfast Aligner, which can be used for aligning files and creating TMs.

Also, if you have Wordfast Classic, you should install PlusTools 4 (on the same download page as Wordfast), a macro with tools for Wordfast Classic, one of which is an aligner.

So both Classic and Pro have tools to align files and then create TMs.

[Edited at 2011-03-24 00:04 GMT]


 

Daina Jauntirans  Identity Verified
Local time: 04:39
German to English
+ ...
Still... Mar 31, 2011

You never know when the "normal" way of doing things will run into glitches, and in that case, it's nice to have a back-up. Then again, this is a week where I have had to have a TM converted to an Excel table and glossary and a glossary converted into a TM to even be able to view them in Pro, so... I think it's quite ingenious that you figured out this workaround.

 

NMR (X)
France
Local time: 11:39
French to Dutch
+ ...
There are no glitches Mar 31, 2011

Daina Jauntirans wrote:

You never know when the "normal" way of doing things will run into glitches, and in that case, it's nice to have a back-up. Then again, this is a week where I have had to have a TM converted to an Excel table and glossary and a glossary converted into a TM to even be able to view them in Pro, so... I think it's quite ingenious that you figured out this workaround.


The created TM is a tab delimited text file, so that you can always read it with every software you like. No need to make other backups than the regular ones on an external disk.

Reading a TM in Excel and manipulating it there can be a bit dangerous, because Excel sometimes transforms cells (dates, for instance), and other cells end up displayed as #####.
But I also use this method when transforming a TM into a glossary, not for backup reasons but in order to have autopropagated text chunks.
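
One way to sidestep Excel's cell conversions is to treat the TM as plain text. The sketch below is only an illustration: it assumes the same tab-delimited layout as the samples quoted in this thread (header line starting with %, source text in the fifth column, target text in the seventh) and writes a simple two-column source/target list. The function name `tm_to_glossary` and the sample data are hypothetical.

```python
from typing import List

# Assumed 0-based positions of the text columns in a tab-delimited TM row.
SOURCE_TEXT, TARGET_TEXT = 4, 6


def tm_to_glossary(tm_text: str) -> List[str]:
    """Return 'source<TAB>target' lines for every complete TM row."""
    entries = []
    for line in tm_text.split("\n"):
        line = line.rstrip("\r")
        if not line.strip() or line.startswith("%"):
            continue  # skip blank lines and the header line
        fields = line.split("\t")
        if len(fields) <= TARGET_TEXT:
            continue  # row too short to contain a target segment
        source, target = fields[SOURCE_TEXT], fields[TARGET_TEXT]
        if source and target:
            entries.append(f"{source}\t{target}")
    return entries


# Made-up sample: "1/2" is the kind of cell Excel may turn into a date.
sample = (
    "%20160119~120101\t%U (User)\t%TU=00000002\t%EN-US\t%Wordfast TM\t%AS-IN\t%---\n"
    "20160119~120102\tU\t0\tEN-US\t1/2\tAS-IN\t1/2\n"
    "20160119~120103\tU\t0\tEN-US\tQ2. QC\tAS-IN\tQ2. QC\n"
)
for entry in tm_to_glossary(sample):
    print(entry)
```

Because every field is handled as a string, nothing is reinterpreted as a date or number along the way.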


 

LFNEI
India
Dear All, With the help of this discussion I have created some TMs and need a suggestion for an issue. Jan 19, 2016

I am preparing TMs using +tools in Wordfast (keeping the Wordfast segmentation), but the header of this TM in the txt file is different than that of the usual TMs I have created during other jobs.

1. Usual TM

%20140308~172629 %M (MEDHI) %TU=00000428 %EN-US %Wordfast TM v.6.03t/00 %AS-IN %---
20140308~172803 M 0 EN-US Q1. Centres AS-IN Q1. Centres EL ST
20140308~172815 M 0 EN-US Q1. Centre AS-IN Q1. চেণ্টাৰ EL

2. Usual TM

%20151217~091810 %U (User) %TU=00000418 %EN-US %Wordfast TM v.6.03t/00 %AS-IN %---
20151217~091938 U 0 EN-US EL ST
20151217~092014 U 0 EN-US Q2. QC AS-IN Q2. QC EL ST

3. TM Created by extract, align and create TM process using +tools

%20160119~120101 %+A! %TU=00000000 %EN-US %Wordfast translation memory version v.5 %AS-IN %This is a header - do not delete, move or sort.
20160119~120102 +A! 0 EN-US Q. In your business setup, are you one of the key decision makers for purchase of files? AS-IN Q4.


I am using the demo version of Wordfast on Windows 8 (MS Word 2007) on a Lenovo laptop. My language pair is EN-US to AS-IN. I am saving the files as txt with Unicode encoding.


Please suggest where I am making mistakes in creating the TM.


 

Lianne van de Ven  Identity Verified
United States
Local time: 05:39
Member (2008)
English to Dutch
+ ...
TOPIC STARTER
Problem not clear Jan 20, 2016

LFNEI wrote:

I am preparing TMs using +tools in Wordfast (keeping the Wordfast segmentation), but the header of this TM in the txt file is different than that of the usual TMs I have created during other jobs.

(...)

I am using the demo version of Wordfast on Windows 8 (MS Word 2007) on a Lenovo laptop. My language pair is EN-US to AS-IN. I am saving the files as txt with Unicode encoding.

Please suggest where I am making mistakes in creating the TM.


It is unclear to me what you are asking. Can you describe the problem that you are experiencing, other than that the header is different? Headers are created automatically, and they are different for different programs.


 

LFNEI
India
Dear Lianne, Thanks for the reply. Jan 20, 2016

I have seen that the usual header of a TM contains 1. Name of the system, 2. Initials of the system, 3. Count of TU and so on....

But in the TM mentioned above, created by the extract, align and create TM process using +tools:
1. Name of the system is +A!

2. Count of TU is %TU=00000000

3. And %This is a header - do not delete, move or sort. is written which is absent in our usual TM headers.

My first problem is that I have to know the number of TUs this TM contains.

Could you please advise me on this?


 

John Di Rico  Identity Verified
France
Local time: 11:39
Member (2006)
French to English
Open in Excel Jan 20, 2016


My first problem is that I have to know the number of TUs this TM contains.

Could you please advise me on this?


Hello,

Open the TM in Excel, take the number of the last row and subtract 1!

Close and do not save.

Best,

John
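
The same count can be obtained without opening Excel at all. This is a minimal sketch, assuming the TM is a text file with exactly one header line followed by one TU per line; the function names are hypothetical, and the default `utf-16` encoding is a guess based on the files being saved as "Unicode" text from Word.

```python
from pathlib import Path


def count_tus(tm_text: str) -> int:
    """Count non-blank lines and subtract one for the header line."""
    lines = [line for line in tm_text.split("\n") if line.strip()]
    return max(len(lines) - 1, 0)


def count_tus_in_file(path: str, encoding: str = "utf-16") -> int:
    """Read a TM file (encoding is an assumption) and count its TUs."""
    return count_tus(Path(path).read_text(encoding=encoding))


# Made-up sample: one header line followed by two TUs.
sample = (
    "%20160119~120101\t%U (User)\t%TU=00000000\t%EN-US\t%Wordfast TM\t%AS-IN\t%---\n"
    "20160119~120102\tU\t0\tEN-US\tQ1. Centre\tAS-IN\tQ1. Centre\n"
    "20160119~120103\tU\t0\tEN-US\tQ2. QC\tAS-IN\tQ2. QC\n"
)
print(count_tus(sample))  # prints 2
```

Since the file is only read, there is nothing to close without saving.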


 

LFNEI
India
Dear John, Thanks for taking time to help me. Jan 21, 2016

I used the suggested method and got the count of the TU's.

Sorry to bother you all again, but is there any way to have this TU count reflected in the TU count field of the TM header itself? I know that there are X TUs in this TM, but that number is not actually written in the %TU=00000000 portion.
[My senior checks the TU count written in %TU=00000000; if it is zero, then he is not going to accept the TM, even if it contains 498 TUs.]

Or should I edit the header of the TM myself and make it %TU=00000498, which would show the count? I fear that might make the TM corrupt or unusable.

Please help if there is any way through which we could do this.
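
If hand-editing the header turns out to be acceptable, the edit itself can be scripted. The sketch below only shows the text manipulation: it counts the non-blank rows after the header and rewrites the eight-digit %TU= field. Whether Wordfast accepts or later overwrites a hand-edited header is not confirmed anywhere in this thread, so this should only ever be tried on a copy of the TM. The function name `set_tu_count` and the tab positions in the sample are assumptions.

```python
import re


def set_tu_count(tm_text: str) -> str:
    """Return tm_text with the header's %TU= field set to the row count."""
    lines = tm_text.split("\n")
    header = lines[0]
    count = sum(1 for line in lines[1:] if line.strip())
    # Replace the first eight-digit %TU= field found in the header.
    new_header, replaced = re.subn(
        r"%TU=\d{8}", f"%TU={count:08d}", header, count=1
    )
    if replaced == 0:
        raise ValueError("no %TU=######## field found in the header")
    return "\n".join([new_header] + lines[1:])


# Header and row adapted from the TM quoted earlier in the thread;
# the tab positions are assumed.
sample = (
    "%20160119~120101\t%+A!\t%TU=00000000\t%EN-US\t"
    "%Wordfast translation memory version v.5\t%AS-IN\t"
    "%This is a header - do not delete, move or sort.\n"
    "20160119~120102\t+A!\t0\tEN-US\tQ2. QC\tAS-IN\tQ2. QC\n"
)
print(set_tu_count(sample).split("\n")[0])
```

Everything except the %TU= field is left byte-for-byte as it was, which keeps the change as small as possible.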


 

Alejandro Pelaez
Bolivia
Local time: 05:39
English to Spanish
Almost gave up with Wordfast Aligner Aug 11

I tried for more than half an hour to extract a TM from two .docx documents (source and target). Here are my takeaways:

1. It does not support .docx; you have to save the files as .doc (Word 97-2003).
2. Save the source and target files in different paths.
3. Wordfast Aligner sucks, but I haven't found any way easier than that.

Good Luck!


 

John Di Rico  Identity Verified
France
Local time: 11:39
Member (2006)
French to English
Other aligners Aug 12

There are other, better aligners available. Please see this article: https://www.wordfast.net/wiki/Alignment

The desktop aligner has not been updated for several years.
Best,
John


 

