Segmentation, parallel corpora, and importing TMs and Glossaries
Thread poster: Souni
German to English
Dec 29, 2009

Hello!

I am currently testing Wordfast Pro. I have read the manual and watched the complete training video, but I still have a few questions. I will try to keep them organized by subject and number them so they are easy to keep track of.

Segmentation: So far, I've been using German as a source language, and my source documents are Word files. Wordfast seems to recognize dates (2. Februar, for example) as it doesn't segment after the period in dates. It has however at one point segmented after a comma, and when I tried to use the "expand" function, I got a message saying that the two segments were not in the same paragraph, which they most certainly were!

1. On what criteria does Wordfast Pro segment texts?
2. Is it possible to modify/add to these criteria?
3. Are segmentation criteria language-specific? Or are the default settings the same for any language?

Parallel corpora: One of my biggest worries about Wordfast Pro is that there does not seem to be a way to import parallel corpora directly into it. It is possible to "create a new TM which can be leveraged later, if a local TM is not available", but the only way to do this seems to be to open a source document and translate it, or copy and paste the translation into the appropriate segments. This is time-consuming. In the past, I have used Wordfast Classic and +Tools, as well as some other TM and CAT tools. I have extracted source and target texts using Wordfast Classic and then aligned them using +Tools, and tried opening the aligned texts (as a Word document) in Wordfast Pro, to see if the source text would end up in the source segments and the target text in the target segments. (Wishful thinking, I know.) Obviously, this failed.

1. Is there any way to import or open a source text AND an already-translated target text in Wordfast Pro, in order to make use of work done before one started using Wordfast Pro? If so, please explain how to do this, and what file formats are acceptable.

2. I assume that it is possible to use a Wordfast Classic TM in Wordfast Pro. All those I had saved as Wordfast Classic TMs, however, do not seem to work. Would it be possible for you to explain how I start with, say, an aligned .doc file from +Tools and Wordfast Classic, and end up with a TM which can then be leveraged by Wordfast Pro?

Glossaries: I normally keep my glossaries as Excel files. Where a file has only one sheet, I have saved it as a tab-delimited text file (.txt) and then successfully imported it into Wordfast Pro. In these cases...

1. My Excel glossaries have 5 columns: source term, target term, source context, target context, and comments. Wordfast Pro has imported my Excel .txt files, but has cut off the last 2 columns, meaning that I have source term, target term, and under comments there is the target language context. Is there any way to modify the glossary structure so that it includes the other information from my Excel glossaries?

In those cases where I had Excel files including more than one sheet, which means I could not save them as tab-delimited...

2. What should I save them as? I have tried .xml, and Wordfast Pro was willing to import the file, but told me that there were 0 entries.


And a final question that has nothing to do with the title, and everything to do with pricing...

If I wanted to buy a Wordfast Pro license for several translators working in one office, would I need to buy one for each translator? One for each computer? One for each language? Or one for each office?

Thanks so much for your time, and your help!

Souni


Yasmin Moslem
Egypt
English to Arabic
Segmentation Dec 30, 2009

Dear Souni,

First of all, please make sure you download and install the latest version of Wordfast Professional available at:
http://www.wordfast.com/store_download.html


Souni wrote:

Segmentation: So far, I've been using German as a source language, and my source documents are Word files. Wordfast seems to recognize dates (2. Februar, for example) as it doesn't segment after the period in dates. It has however at one point segmented after a comma, and when I tried to use the "expand" function, I got a message saying that the two segments were not in the same paragraph, which they most certainly were!

1. On what criteria does Wordfast Pro segment texts?
2. Is it possible to modify/add to these criteria?
3. Are segmentation criteria language-specific? Or are the default settings the same for any language?



Wordfast Pro does not segment text after a comma. The break you saw was caused by a paragraph mark following the comma, which is also why you could not expand the segment. If you think this is not the case, please send me the file (or the relevant portion of it).
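To illustrate the behaviour described above, here is a toy segmenter. It is not Wordfast Pro's actual segmentation code; the rule set (break at paragraph marks, break after ". ! ?" unless the period directly follows a digit, never break at a comma) and the names `SENTENCE_END` and `segment` are assumptions made only for this sketch.

```python
import re

# Toy rule (assumption): break after . ! or ? followed by whitespace,
# except when the period directly follows a digit, as in "2. Februar".
SENTENCE_END = re.compile(r"(?<=[.!?])(?<!\d\.)\s+")

def segment(text):
    """Return (paragraph_index, segment_text) pairs for plain text.

    Each line of the input stands for one paragraph, so a line break
    plays the role of a paragraph mark.
    """
    segments = []
    for index, paragraph in enumerate(text.split("\n")):
        for piece in SENTENCE_END.split(paragraph.strip()):
            if piece:
                segments.append((index, piece))
    return segments

sample = (
    "Sehr geehrte Damen und Herren,\n"
    "am 2. Februar beginnt der Kurs. Bitte melden Sie sich an."
)
for paragraph_index, piece in segment(sample):
    print(paragraph_index, piece)
```

In this sketch the first segment ends with a comma only because a paragraph mark follows it, and the date "2. Februar" stays in one piece.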

Currently, end-of-segment punctuation is not user-defined. But you can expand or shrink segments as follows:

To shrink a segment (to split it into two parts):

1- Double-click the source segment to unlock it.
2- Place the cursor at *any* location in the source text.
3- Click on the toolbar button "Shrink Segment".

Note 1: Now, "Shrink Segment" works even if the target text has been modified.

Note 2: In large files, this can take a few seconds. So, be patient; do not click the button twice.


To expand a segment (to merge it with the next segment):

1- Select the first segment; this is enough.
2- Click on the toolbar button "Expand Segment".

Note 1: This works even with already shrunk segments.
Note 2: Segments in two paragraphs cannot be merged.
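The two operations can be pictured with a small model. This is only a sketch under my own assumptions, not how Wordfast Pro stores segments internally: each segment is a `(paragraph_index, text)` pair, and `shrink` and `expand` are hypothetical helper names.

```python
def shrink(segments, i, offset):
    """Split segment i at a character offset; both halves stay in its paragraph."""
    paragraph, text = segments[i]
    left, right = text[:offset].rstrip(), text[offset:].lstrip()
    if not left or not right:
        raise ValueError("cursor must be inside the segment text")
    return segments[:i] + [(paragraph, left), (paragraph, right)] + segments[i + 1:]

def expand(segments, i):
    """Merge segment i with the next one, refusing to cross a paragraph mark."""
    if i + 1 >= len(segments):
        raise ValueError("no following segment to merge with")
    (p1, t1), (p2, t2) = segments[i], segments[i + 1]
    if p1 != p2:
        raise ValueError("segments are not in the same paragraph")
    return segments[:i] + [(p1, t1 + " " + t2)] + segments[i + 2:]

segments = [
    (0, "Sehr geehrte Damen und Herren,"),
    (1, "Der Kurs beginnt am Montag. Bitte melden Sie sich an."),
]
cut = segments[1][1].index("Bitte")
split_up = shrink(segments, 1, cut)  # three segments
restored = expand(split_up, 1)       # back to two segments
try:
    expand(restored, 0)              # crosses a paragraph mark
except ValueError as error:
    print(error)
```

The last call fails in the same way as the message in the original question: the two segments belong to different paragraphs, so they cannot be merged.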


---
Best regards,
Yasmin Moslem

Wordfast Support Team
www.wordfast.com | www.wordfast.net



[Edited at 2009-12-30 10:57 GMT]


Yasmin Moslem
Egypt
English to Arabic
TMs Dec 30, 2009

Souni wrote:

Parallel corpora: One of my biggest worries about Wordfast Pro is that there does not seem to be a way to import parallel corpora directly into it. It is possible to "create a new TM which can be leveraged later, if a local TM is not available", but the only way to do this seems to be to open a source document and translate it, or copy and paste the translation into the appropriate segments.



This is one of the WFP features that should be available soon: a built-in alignment tool is being developed.


Souni wrote:

In the past, I have used Wordfast Classic and +Tools, as well as some other TM and CAT tools. I have extracted source and target texts using Wordfast Classic and then aligned them using +Tools, and tried opening the aligned texts (as a Word document) in Wordfast Pro, to see if the source text would end up in the source segments and the target text in the target segments. (Wishful thinking, I know.) Obviously, this failed.




You need to complete the process in +Tools and create a TM from those aligned documents. You will then be able to use that TM in Wordfast Pro. I hope this is clear.


Souni wrote:

1. Is there any way to import or open a source text AND an already-translated target text in Wordfast Pro, in order to make use of work done before one started using Wordfast Pro? If so, please explain how to do this, and what file formats are acceptable.




Please see above!


Souni wrote:


2. I assume that it is possible to use a Wordfast Classic TM in Wordfast Pro. All those I had saved as Wordfast Classic TMs, however, do not seem to work. Would it be possible for you to explain how I start with, say, an aligned .doc file from +Tools and Wordfast Classic, and end up with a TM which can then be leveraged by Wordfast Pro?



It should work without any problem.

As you know, the TM is a text file (.txt). If you have a Wordfast Classic TM (or a TM created from a +Tools alignment), you can simply open it in Wordfast Pro. Here are the instructions:

- Make sure you create/open a project with the suitable language variants.
- Go to "Translation Memory" menu > Select/New TM > "Open"
- Locate the TM text file
- OK

That is it.
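Since the TM is a plain text file, it can help to look at its first lines before opening it, for example to check that the language codes match the project. The following is only a minimal sketch: it assumes the TM is tab-delimited and guesses the encoding from a byte-order mark (falling back to UTF-8), and `peek_tm` is a hypothetical helper name. The exact layout of the header line is not described in this thread.

```python
import codecs
import csv

def peek_tm(path, rows=3):
    """Return the first few rows of a tab-delimited TM file as lists of fields."""
    with open(path, "rb") as handle:
        start = handle.read(4)
    # Assumption: a UTF-16 byte-order mark means UTF-16, otherwise try UTF-8.
    if start.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        encoding = "utf-16"
    else:
        encoding = "utf-8-sig"
    with open(path, encoding=encoding, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        # zip stops after `rows` rows, so large TMs are not read in full.
        return [row for _, row in zip(range(rows), reader)]
```

Calling `peek_tm` on the TM file and printing the result shows the fields of the header line and of the first translation units, which is usually enough to spot a mismatched language code or an unexpected encoding.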


If you have problems with that, please let me know exactly what they are, what you tried, and what happened.




---
Best regards,
Yasmin Moslem

Wordfast Support Team
www.wordfast.com | www.wordfast.net


Yasmin Moslem
Egypt
English to Arabic
Excel - License Dec 30, 2009

Souni wrote:

In those cases where I had Excel files including more than one sheet, which means I could not save them as tab-delimited...




In MS Excel, make sure the required sheet is active before using "Save as". This gives you one tab-delimited text glossary per sheet. You can then import all of those glossaries into a single WFP glossary.
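If you prefer to combine the per-sheet files before importing, something like the following could do it. This is a rough sketch, not a Wordfast feature: `merge_glossaries` is a hypothetical helper, the file names in the usage note are placeholders, and the encoding is an assumption that has to match whatever Excel wrote. It also assumes no cell contains a tab or a line break.

```python
def merge_glossaries(paths, out_path, encoding="utf-8"):
    """Concatenate tab-delimited glossary files, skipping blank and duplicate rows.

    Returns the number of rows written to out_path.
    """
    seen = set()
    merged = []
    for path in paths:
        with open(path, encoding=encoding) as handle:
            for line in handle:
                row = tuple(cell.strip() for cell in line.rstrip("\r\n").split("\t"))
                # Keep a row only if it has some content and is not a repeat.
                if any(row) and row not in seen:
                    seen.add(row)
                    merged.append(row)
    with open(out_path, "w", encoding=encoding) as handle:
        for row in merged:
            handle.write("\t".join(row) + "\n")
    return len(merged)
```

For example, `merge_glossaries(["sheet1.txt", "sheet2.txt"], "all_terms.txt")` would write one combined file that can then be imported in a single step.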


Souni wrote:

If I wanted to buy a Wordfast Pro license for several translators working in one office, would I need to buy one for each translator? One for each computer? One for each language? Or one for each office?



One for each computer.


If you have any further questions, please let me know.

---
Best regards,
Yasmin Moslem

Wordfast Support Team
www.wordfast.com | www.wordfast.net


Souni
German to English
TOPIC STARTER
Many thanks Dec 30, 2009

That was the most satisfying and useful answer I've ever gotten on a forum!

Thank you very much for this complete and quick response,

Souni

