How should Russian to English segment length be set up in Wordfast Classic?
Thread poster: Chris Lovelace

Chris Lovelace  Identity Verified
Argentina
Local time: 09:24
Russian to English
+ ...
Nov 26, 2009

I am new to Wordfast, and am trying to get my translation memory to work. The sentences in my document are very long, but I haven't found a length of any segments that actually produces any fuzzy matches, even though there should be...

What do I need to keep in mind as I build a TM for Russian to English in Wordfast Classic so that I get the best leverage for the CAT tool?


 

Sergei Leshchinsky  Identity Verified
Ukraine
Local time: 17:24
Member (2008)
English to Russian
+ ...
fuzzy match Nov 26, 2009

A fuzzy match is the result of statistical processing of a segment. It does not depend on segment length; it is the share (weighted part) of one segment that resembles or coincides with another segment. The smaller the required share, the higher the probability of a match, since short words (or character sequences) are more likely to be repeated in a different segment.

Try setting the fuzzy-match threshold at about 55%, i.e. slightly more than half of the segment repeated. However, at that level it will trigger more often on articles, prepositions and linking verbs. :)

Play with the settings. You need to get a feel for how they act.
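
To make the idea of a threshold concrete, here is a minimal Python sketch. It is not Wordfast's actual scoring algorithm (which is not described in this thread); it uses the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure, and the names `fuzzy_score` and `best_match` are made up for illustration.

```python
from difflib import SequenceMatcher


def fuzzy_score(new_segment, tm_segment):
    """Return a 0-100 character-based similarity score (a stand-in, not Wordfast's formula)."""
    return 100.0 * SequenceMatcher(None, new_segment, tm_segment).ratio()


def best_match(new_segment, tm_sources, threshold=55.0):
    """Return (score, source) for the best TM entry at or above the threshold, else None."""
    scored = [(fuzzy_score(new_segment, s), s) for s in tm_sources]
    score, source = max(scored, default=(0.0, None))
    if source is None or score < threshold:
        return None
    return score, source


if __name__ == "__main__":
    tm = [
        "The contract is valid for two years.",
        "Payment is due on delivery.",
    ]
    # A lower threshold lets weaker matches through; a higher one hides them.
    print(best_match("The contract is valid for one year.", tm, threshold=55.0))
    print(best_match("Completely unrelated text", tm, threshold=95.0))
```

Lowering `threshold` in this sketch returns more candidates, including weak ones, which mirrors the trade-off described above.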

[Edited at 2009-11-26 08:30 GMT]


 

Chris Lovelace  Identity Verified
Argentina
Local time: 09:24
Russian to English
+ ...
TOPIC STARTER
Thanks! Any other suggestions? Nov 26, 2009

Sergei Leshchinsky wrote:

A fuzzy match is the result of statistical processing of a segment. It does not depend on segment length; it is the share (weighted part) of one segment that resembles or coincides with another segment. The smaller the required share, the higher the probability of a match, since short words (or character sequences) are more likely to be repeated in a different segment.

Try setting the fuzzy-match threshold at about 55%, i.e. slightly more than half of the segment repeated. However, at that level it will trigger more often on articles, prepositions and linking verbs. :)

Play with the settings. You need to get a feel for how they act.

[Edited at 2009-11-26 08:30 GMT]


Thanks! I wasn't sure (and am still not quite sure, actually) how far I can push those thresholds and still have the program function. I've just reset the threshold to 55%, so I'm looking forward to seeing how well that works.

Are there any other settings (Pandora's Box, for example) that I can adjust to get better performance for Russian to English translation?


 

Sergei Leshchinsky  Identity Verified
Ukraine
Local time: 17:24
Member (2008)
English to Russian
+ ...
not sure P.B. will help Nov 26, 2009

In Russian, we have prefixes, suffixes and endings. Besides, we have 6 cases for nouns, adjectives, pronouns and participles, multiplied by 3 genders (m, f, n) and by 2 numbers (sing., pl.)... Moreover, root vowels sometimes change within a paradigm, so some words have several stem forms.
As you see, a word is much more flexible than in English, hence the probability of coincidence is lower...

All I can recommend is to keep the fuzzy threshold low and build your TM... As soon as you stuff it with all the word forms, you will get more fuzzies. This is Russian. ;)


 

Chris Lovelace  Identity Verified
Argentina
Local time: 09:24
Russian to English
+ ...
TOPIC STARTER
Does it help if the segments are shorter? Nov 26, 2009

At present, I have about 300 lines (of long sentences) in the TM, and have gotten only 2 fuzzy matches. There is not a great deal of repetition in the document, but there ARE at least some repeats.

Is the TM just too small at this point, or should I be loading smaller segments in?

Again, I'm not sure if this is a dumb question or not, since I don't really know the software well.

Also, does it help to use wildcards in the glossary? I seem to be getting mixed results.

EXAMPLE: Should the term be entered as "представление", "представлени*", or "представлени" to pick up all the inflected forms?
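
For reference, this is how the three candidate entries behave under generic shell-style wildcard matching, sketched in Python with the standard `fnmatch` module. This is only an illustration of what a trailing `*` usually means; how Wordfast's glossary actually treats each entry is not stated here, and `matches_entry` is a made-up helper.

```python
from fnmatch import fnmatchcase


def matches_entry(entry, word):
    """True if a glossary entry (optionally containing '*') matches a whole word."""
    return fnmatchcase(word.lower(), entry.lower())


if __name__ == "__main__":
    inflected = ["представление", "представления", "представлением"]
    for entry in ["представление", "представлени*", "представлени"]:
        hits = [w for w in inflected if matches_entry(entry, w)]
        print(entry, "->", hits)
```

Under these generic semantics only the entry with the trailing `*` covers all of the inflected forms; the full form matches only itself, and the bare stem matches none of them as a whole word.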

Thanks again for your help!

[Edited at 2009-11-26 15:01 GMT]


 

Sergei Tumanov  Identity Verified
Local time: 17:24
English to Russian
+ ...
I would recommend Nov 26, 2009

entering short 'chunks' of longer sentences into the WF glossary.
This has sometimes worked for me when the longer sentence was an obvious combination of shorter ones.


[Edited at 2009-11-26 15:10 GMT]


 

Sergei Leshchinsky  Identity Verified
Ukraine
Local time: 17:24
Member (2008)
English to Russian
+ ...
try adding segmentation rules Nov 26, 2009

Add more stop characters. Make sure it breaks segments at the colon and semicolon... you may also add the em-dash (—) as a stop character. That way you will get smaller chunks (as Sergei says above). You will still be able to extend a segment to the next stop character by pressing ALT+CTRL+PageDown, or clip it with ALT+CTRL+PageUp. :)
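
The effect of extra stop characters can be sketched in Python as follows. This is not Wordfast's segmentation code; the default stop set `.!?` is an assumption made for the example, and `segment` is a hypothetical helper that simply splits after any stop character followed by whitespace.

```python
import re

DEFAULT_STOPS = ".!?"      # assumed defaults, for illustration only
EXTRA_STOPS = ":;\u2014"   # colon, semicolon, em-dash, as suggested above


def segment(text, stops=DEFAULT_STOPS + EXTRA_STOPS):
    """Split text after any stop character that is followed by whitespace."""
    pattern = "(?<=[" + re.escape(stops) + "])\\s+"
    return [s for s in re.split(pattern, text.strip()) if s]


if __name__ == "__main__":
    text = "Terms: payment in advance; term \u2014 one year. End."
    print(segment(text, stops=DEFAULT_STOPS))  # long segments
    print(segment(text))                       # shorter chunks
```

With the extra stop characters the same text yields more, shorter segments, which are more likely to recur elsewhere in a document.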

[Edited at 2009-11-26 15:25 GMT]


 

Chris Lovelace  Identity Verified
Argentina
Local time: 09:24
Russian to English
+ ...
TOPIC STARTER
Thanks Nov 26, 2009

Sergei Leshchinsky wrote:

Add more stop characters. Make sure it breaks segments at the colon and semicolon... you may also add the em-dash (—) as a stop character. That way you will get smaller chunks (as Sergei says above). You will still be able to extend a segment to the next stop character by pressing ALT+CTRL+PageDown, or clip it with ALT+CTRL+PageUp. :)

[Edited at 2009-11-26 15:25 GMT]



Thanks! I'll continue experimenting. You've really helped me out a lot. I appreciate it!


 

