create a TM from a translated file not with Wordfast
Thread poster: Tetyana Lavrenchuk
Tetyana Lavrenchuk  Identity Verified
Italy
Local time: 21:43
Italian to Russian
+ ...
Dec 24, 2008

Dear all,

can someone help me to create a TM from an already translated file? I mean I have an original version and a translated file (which was not translated by means of any translation tool but simply in Word) and I need them either to be added to an existing TM or used to create a new one. How can I do it? Thank you so much!
Tetyana

Another question, probably not related to this issue: how can I edit an existing TM? Can I edit it in Word, or only in WordPad?


[Edited at 2008-12-24 11:16 GMT]



Lori Cirefice  Identity Verified
France
Local time: 21:43
French to English
Alignment Dec 24, 2008

You need to align the 2 files, check the instructions for Plustools on how to do this.


Epameinondas Soufleros  Identity Verified
Greece
Local time: 22:43
Member (2008)
English to Greek
+ ...
It's easy with PlusTools Dec 24, 2008

Hello,

You need to download the free PlusTools from the Wordfast site, and use the +Align tool.

You open the two files in Word and then the tool aligns them, i.e. creates a table in Word with the source text in the left column and the target text in the right column. If the segments do not match up exactly, there are handy shortcuts to manually adjust the alignment. After that process, i.e. when your ST units and your TT units are side by side in the table, you can create the TM (basically, it converts the file to a tab-delimited .txt and adds a header with info such as source and target language).
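
To make the idea of a tab-delimited file concrete, here is a minimal Python sketch that turns already aligned pairs into one source/target line per segment. It is only an illustration: the header line it writes is a made-up placeholder, not the real Wordfast TM header, and the function names are hypothetical. For a TM that Wordfast can actually open, let +Align create the file.

```python
def clean(segment):
    # Tabs and line breaks inside a segment would break the column layout,
    # so collapse all whitespace runs into single spaces.
    return " ".join(segment.split())


def pairs_to_tab_delimited(pairs, source_lang, target_lang):
    # Placeholder header (an assumption for illustration only,
    # NOT the header format that Wordfast writes).
    lines = [f"# source={source_lang}\ttarget={target_lang}"]
    for source, target in pairs:
        lines.append(f"{clean(source)}\t{clean(target)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    aligned = [
        ("Buongiorno.", "Добрый день."),
        ("Come\tstai?", "Как дела?"),
    ]
    print(pairs_to_tab_delimited(aligned, "it", "ru"), end="")
```

The point of `clean` is that a stray tab inside a segment would otherwise be read as a column separator.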

After that, you can use Wordfast to merge this TM with another one, if you want.

Best regards,
Epameinondas Soufleros


esperantisto  Identity Verified
Local time: 22:43
Member (2006)
English to Russian
+ ...
To edit a TM, Dec 24, 2008

you can use any text editor capable of processing Unicode files. My choice (for other text editing tasks as well) is jEdit (www.jedit.org). As for word processors, it’s generally not a good idea to use such programs, because they tend to apply various kinds of automatic formatting, which may result in a totally messed-up and corrupted TM. In any case, you must clearly know what you’re going to do and understand the TM format. Refer to the respective section of the user manual.
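
One way to catch accidental damage after a manual edit is to check that every line of a tab-delimited TM still has the same number of fields. The Python sketch below assumes that the first line is a header and that all other lines should share one field count; both are assumptions for illustration, and the function name is hypothetical. The user manual remains the reference for the real format.

```python
from collections import Counter


def find_irregular_lines(tm_text, has_header=True):
    """Return (line_number, field_count) for lines that deviate from the usual field count."""
    lines = tm_text.splitlines()
    start = 1 if has_header else 0  # assumption: the first line is a header
    counts = [
        (number, len(line.split("\t")))
        for number, line in enumerate(lines[start:], start=start + 1)
        if line.strip()
    ]
    if not counts:
        return []
    # Treat the most common field count as the expected one.
    expected = Counter(count for _, count in counts).most_common(1)[0][0]
    return [(number, count) for number, count in counts if count != expected]


if __name__ == "__main__":
    sample = "HEADER\tinfo\na\tb\tc\nd\te\tf\ng\th\n"
    print(find_irregular_lines(sample))
```

Any line it reports is worth opening in the text editor before the TM is used again.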


Fernando Guimaraes  Identity Verified
Portugal
Local time: 20:43
German to Portuguese
+ ...
Other Dec 24, 2008

Another free tool is bitext2tmx

http://sourceforge.net/projects/bitext2tmx/



Soonthon LUPKITARO(Ph.D.)  Identity Verified
Thailand
Local time: 02:43
Partial member (2004)
English to Thai
+ ...
TM tags Dec 25, 2008

To edit a TM, you must avoid changing the TM tags.
A TM can be in *.txt, *.tmx or other formats. It is best to use a text editor, but you must understand the specific codes used in the TM or you will destroy the entire TM.
[This is why Word is not good for this purpose.]
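
For a *.tmx file, a quick way to see whether the tags survived an edit is to parse the file again. The Python sketch below uses a tiny hand-written TMX sample; the sample content and the function name are assumptions for illustration, not taken from any real TM. It only proves that the markup is still well formed, not that the translations are right.

```python
import xml.etree.ElementTree as ET

# Hand-written sample for illustration only.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example-tool" creationtoolversion="1.0"
          segtype="sentence" o-tmf="none" adminlang="en"
          srclang="it" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="it"><seg>Buongiorno.</seg></tuv>
      <tuv xml:lang="ru"><seg>Добрый день.</seg></tuv>
    </tu>
  </body>
</tmx>"""


def count_translation_units(tmx_text):
    # ET.fromstring raises ET.ParseError if a tag was damaged or deleted.
    root = ET.fromstring(tmx_text)
    return len(root.findall("./body/tu"))


if __name__ == "__main__":
    print(count_translation_units(SAMPLE_TMX))
    damaged = SAMPLE_TMX.replace("</seg>", "", 1)  # simulate a deleted closing tag
    try:
        count_translation_units(damaged)
    except ET.ParseError as error:
        print("TMX is no longer well formed:", error)
```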

Soonthon L.


Tetyana Lavrenchuk  Identity Verified
Italy
Local time: 21:43
Italian to Russian
+ ...
TOPIC STARTER
cannot install on Word 2007 Dec 25, 2008

Lori Cirefice wrote:

You need to align the 2 files, check the instructions for Plustools on how to do this.


I've tried to install +Tools, but as I have the Russian version of Word 2007 I cannot install it, because the technical explanation on the Wordfast site of how to install this tool probably relates to earlier versions of Word.



Lori Cirefice  Identity Verified
France
Local time: 21:43
French to English
Word 2007 Dec 25, 2008

http://www.proz.com/prozwiki/en:Wordfast_how_to_Word_2007_-_Install_PlusTools

Try these instructions - yes, the procedure is slightly different for Word 2007, but it does work, I am using Word 2007 as well.


Tetyana Lavrenchuk  Identity Verified
Italy
Local time: 21:43
Italian to Russian
+ ...
TOPIC STARTER
step by step procedure Dec 27, 2008

Lori Cirefice wrote:

http://www.proz.com/prozwiki/en:Wordfast_how_to_Word_2007_-_Install_PlusTools

Try these instructions - yes, the procedure is slightly different for Word 2007, but it does work, I am using Word 2007 as well.


Excuse me for so many questions, but I still cannot install +Tools.

I've seen an article about installing +Tools in Word 2007, but I think some step is missing from the installation instructions, one which is obvious to an advanced Word user but not to a basic one like me.
When I download the files from the Wordfast site, there are two files, plustools.doc and plustools.dot. Do I need to save plustools.dot as a Word add-in? I've added it as a Word add-in, but it gets saved in .docx format... and when I press ALT+F2 nothing happens.

I feel really like an idiot...


FarkasAndras
Local time: 21:43
English to Hungarian
+ ...
Article on alignment Dec 28, 2008

I can't help you with installing Plustools but I can give you a better option instead.
I just wrote an article on alignment. I submitted it to ProZ an hour ago so it's not up yet, but I uploaded the article itself and a software package to:

http://www.mediafire.com/?gtbwgjktzjf

Hunalign and the method described there is not very user-friendly (none of the software tools has a gui) but it is far superior to any other option that I know of
(Plustools, Winalign, Europarl aligner or fully manual alignment, and probably better than the vanilla aligner as well, which I haven't researched).

Installing Plustools anyway may not be a bad idea, it has a couple of nice functions.

[Subject edited by staff or moderator 2008-12-28 20:42 GMT]

[Subject edited by staff or moderator 2008-12-28 20:43 GMT]



Samuel Murray  Identity Verified
Netherlands
Local time: 21:43
Member (2006)
English to Afrikaans
+ ...
Alignment article Dec 29, 2008

FarkasAndras wrote:
I just wrote an article on alignment.
http://www.mediafire.com/?gtbwgjktzjf


Why is the article 4.1 MB large, Farkas?

Hunalign and the method described there is not very user-friendly (none of the software tools has a gui) but...


To me, the GUI is the second most important part of the aligner. If the GUI is unfriendly, using the program will be frustrating. Alignment is not an enjoyable exercise IMO, and if the GUI (or lack thereof) makes it difficult or slow to use, then the program has failed in one of its main functions, as far as I'm concerned.

Installing Plustools anyway may not be a bad idea, it has a couple of nice functions.


PlusTools has only the most basic functions that one would expect any aligner to have -- there are no additional functions in PlusTools. Which "nice functions" are you referring to?


FarkasAndras
Local time: 21:43
English to Hungarian
+ ...
Answers to the various issues raised by the previous poster Dec 29, 2008

The article is just a small MS Word file, but the zip package contains all the software needed, including a 300,000-word English-Hungarian dictionary that comes with Hunalign, and a few bits and pieces I added to make it all easier.

If somebody wants it, I can upload the Word file on its own.

The idea was to publish the article here and reference the zip package, but proz reviews articles so it's not up yet. I just dropped the link here because it's relevant here.


Surely, you of all people don't care about a GUI...??? The whole idea of Hunalign is that it does an amazingly good automatic alignment without you wasting 10 hours of your life pairing sentences through a GUI one by one. It uses a dictionary and the length of segments and makes VERY good guesses about pairing. You can set it to discard TUs that seem to be a poor match (high probability of bad pairing) and use the raw output straight away, or you can review the output.
It was designed to be used for building large parallel corpora. No GUI, you run it through command line, but it's powerful. It's sort of industrial:)
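
As a rough illustration of discarding poor matches, here is a Python sketch that keeps only pairs whose score reaches a threshold. It assumes the aligner output has been saved as text with three tab-separated columns (source, target, score); that column layout, the threshold value and the function name are assumptions for illustration, so check them against your own output file.

```python
def filter_by_score(lines, min_score):
    """Keep (source, target) pairs whose score is at least min_score."""
    kept = []
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue  # not in the assumed source/target/score layout
        source, target, score_text = parts
        try:
            score = float(score_text)
        except ValueError:
            continue  # the score column is not a number
        # Drop low-scoring pairs and pairs with an empty side.
        if score >= min_score and source and target:
            kept.append((source, target))
    return kept


if __name__ == "__main__":
    sample = [
        "Hello.\tSzia.\t1.25",
        "Bad\tRossz pár\t-0.3",
        "\tOrphan\t0.9",
        "broken line",
    ]
    print(filter_by_score(sample, 0.5))
```

A higher threshold throws away more good pairs along with the bad ones, so the value is a trade-off you have to tune on your own texts.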


I don't see Plustools as an alignment tool at all. I meant other useful features like search and replace in various documents at the same time (be careful though, a couple of times it corrupted my documents and I had to use a backup I luckily had), word count on multiple documents and batch conversion to txt, which happened to come in handy for a big alignment project. It can do lots of other things too, which I have no use for.

[Edited at 2008-12-29 11:57 GMT]

[Edited at 2008-12-30 11:51 GMT]


FarkasAndras
Local time: 21:43
English to Hungarian
+ ...
It's up at long last Jan 4, 2009

http://www.proz.com/translation-articles/articles/2176/1/Aligning-texts-with-Hunalign


Comments/corrections welcome here or in PM.



Samuel Murray  Identity Verified
Netherlands
Local time: 21:43
Member (2006)
English to Afrikaans
+ ...
Answers to Farkas Jan 4, 2009

FarkasAndras wrote:
The whole idea of Hunalign is that it does an amazingly good automatic alignment without you wasting 10 hours of your life pairing sentences through a GUI one by one. ... It was designed to be used for building large parallel corpora.


I think one must distinguish two types of alignments, or rather two different needs, or perhaps even two different purposes.

For large parallel corpora, a mismatch of 5% would be quite acceptable. Parallel corpora are rarely used for translation memory, although they can be used in TM programs for bilingual concordance search. Such files contain thousands or even millions of segments and it is unrealistic to review the alignment one segment at a time.

If an aligned bitext is to be used as a TM, reviewing the result one segment at a time becomes more crucial. For these small or medium sized bitexts even 1% mismatch would be grossly unacceptable. Doing such a review is time consuming, and therefore it is important that the alignment program is easy to use and enables the translator to work quickly where possible. A well designed GUI is part of this.

I don't see Plustools as an alignment tool at all. I meant other useful features like search and replace in various documents at the same time ..., word count on multiple documents and batch conversion to txt, which happened to come in handy for a big alignment project.


PlusTools is not an aligner. PlusTools *contains* an aligner, along with other tools. The functions you mention should not be seen as part of the aligner.

What the PlusTools aligner lacks is a highly customisable segmenter. However, Wordfast has a very good segmenter and you can use Wordfast for segmenting the files if the built-in segmenter in PlusTools' aligner is insufficient. In my language pair, the PlusTools aligner's segmenter is good enough.

The original question related to a number of documents to be aligned for the purpose of translation memory. The amount of text in this query was small (probably less than 10 000 segments) and the degree of accuracy required was extremely high.


FarkasAndras
Local time: 21:43
English to Hungarian
+ ...
answers Jan 4, 2009

Well yes, there are various types of alignment projects, some requiring more precision than others.

But I think (know, actually) that there are TM building situations where 1 or 2% mismatch is perfectly acceptable. You can always go back to the aligned bitext and find the correct match there in case concordance spits out an incorrect sentence pair. If you can reasonably expect that to become necessary 0 to 5 times during translation and ensuring 100% alignment would take several hours, the choice is obvious.
Actually, the fast and fairly reliable automatic alignment may convince one to align two large texts that one would never think of aligning otherwise because of the disproportionate time investment needed. You can align 200,000 TUs with 95%+ precision in under 30 minutes and use the automatic output straight away as a reference TM.
Even if you do need 100% precision I feel Hunalign is the right tool.

The "GUI" I use and propose is Excel... it is a fairly efficient tool for correcting an already decent alignment. Two columns side by side, you scroll down and stop at incorrect pairs. Really, there is no other way of doing it anyway. One could develop a couple of simple macros to make the process even faster if needed. If you insist on using some GUI aligner you know, you could probably find a way of passing the output of Hunalign in there. I just don't see the need as Excel is pretty good.
I repeat, I think that the exponentially better automatic alignment (95-99+% correct pairs can be reasonably expected) makes the method worth using despite minor inconvenience.
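
A simple helper for this kind of review could flag the rows that deserve a closer look, for example pairs where one side is empty or where the two sides differ wildly in length. The Python sketch below is a crude heuristic of my own for illustration, not what Hunalign does internally; the ratio limits and the function name are hypothetical and will need tuning per language pair.

```python
def flag_suspicious_pairs(pairs, low=0.5, high=2.0):
    """Return 1-based row numbers of pairs that look like misalignments."""
    flagged = []
    for index, (source, target) in enumerate(pairs, start=1):
        if not source or not target:
            flagged.append(index)  # one side is empty
            continue
        ratio = len(target) / len(source)  # character length ratio
        if ratio < low or ratio > high:
            flagged.append(index)
    return flagged


if __name__ == "__main__":
    rows = [
        ("Good morning.", "Jó reggelt."),
        ("Yes.", "Ez egy sokkal hosszabb mondat, mint a forrás."),
        ("Thank you very much.", ""),
    ]
    print(flag_suspicious_pairs(rows))
```

The row numbers match the rows of the two-column sheet, so you can jump straight to them instead of scrolling through everything.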



Re: Plustools we are in complete agreement, you just misunderstood me slightly. I recommended that the OP install it for its features that might come in handy at some point for other purposes, not alignment.

For aligning a few thousand TUs with 100% accuracy, I really do think Hunalign is the best solution - provided that you have a dictionary which is pretty easy to acquire. Skim through the article if you're interested.

BTW I realized after writing the article that a CAT tool could be used for segmentation. A bilingual Word doc and some clever search and replace should produce the right format for Hunalign.

[Edited at 2009-01-04 23:48 GMT]

