How should Russian to English segment length be set up in Wordfast Classic?
Thread poster: Chris Lovelace

Chris Lovelace  Identity Verified
Argentina
Local time: 02:07
Russian to English
+ ...
Nov 26, 2009

I am new to Wordfast, and am trying to get my translation memory to work. The sentences in my document are very long, but I haven't found a length of any segments that actually produces any fuzzy matches, even though there should be...

What do I need to keep in mind as I build a TM for Russian to English in Wordfast Classic so that I get the best leverage for the CAT tool?



Sergei Leshchinsky  Identity Verified
Ukraine
Local time: 08:07
Member (2008)
English to Russian
+ ...
fuzzy match Nov 26, 2009

A fuzzy match is the result of a statistical comparison between segments; it does not depend on segment length as such. The score is the weighted share of one segment that resembles (or coincides with) another segment. The smaller the share you require, the more likely a match becomes, because short words (or character sequences) are more likely to recur in a different segment.

Try setting the fuzzy-match threshold at about 55%, i.e. slightly more than half of the segment repeated. At that level, however, matches will be triggered more often by articles, prepositions and linking verbs.

Play with the settings. You need to get a feel for how they act.
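
To make the idea of a threshold concrete, here is a minimal sketch in Python. It is not Wordfast's matching algorithm, which is not described in this thread; it uses the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure, and the names `fuzzy_score` and `best_match` are hypothetical.

```python
from difflib import SequenceMatcher


def fuzzy_score(new_segment: str, tm_segment: str) -> float:
    """Return a 0-100 character-based similarity score.

    Assumption: difflib stands in for the CAT tool's own scoring.
    autojunk=False keeps long sentences from being treated as noise.
    """
    matcher = SequenceMatcher(None, new_segment, tm_segment, autojunk=False)
    return 100 * matcher.ratio()


def best_match(new_segment, tm, threshold=55.0):
    """Return (score, source, target) for the best TM entry at or above
    the threshold, or None if nothing qualifies.

    `tm` is a list of (source, target) pairs.
    """
    scored = [(fuzzy_score(new_segment, src), src, tgt) for src, tgt in tm]
    qualifying = [item for item in scored if item[0] >= threshold]
    return max(qualifying, default=None)
```

Lowering `threshold` lets more TM entries through, which is the trade-off described above: more proposals, but more of them based on incidental overlap.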

[Edited at 2009-11-26 08:30 GMT]



Chris Lovelace  Identity Verified
Argentina
Local time: 02:07
Russian to English
+ ...
TOPIC STARTER
Thanks! Any other suggestions? Nov 26, 2009

Thanks! I wasn't sure (and am still not quite sure, actually) how far I can push those thresholds and still have the program function. I've just reset the threshold to 55%, so I'm looking forward to seeing how well that works.

Are there any other settings (Pandora's Box, for example) that I can adjust to get better performance for Russian to English translation?



Sergei Leshchinsky  Identity Verified
Ukraine
Local time: 08:07
Member (2008)
English to Russian
+ ...
not sure P.B. will help Nov 26, 2009

Russian has prefixes, suffixes and endings. On top of that, there are 6 cases for nouns, adjectives, pronouns and participles, multiplied by 3 genders (m, f, n) and by 2 numbers (sing., pl.). Moreover, root vowels sometimes change within a paradigm, so some words have several stem forms.
As you can see, a Russian word is much more flexible than an English one, hence the probability of coincidence is lower.

All I can recommend is to keep the fuzzy threshold low and build your TM. As soon as you have stuffed it with all the word forms, you will get more fuzzies. This is Russian.
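
The effect of inflection can be illustrated with a small Python sketch. This is only an illustration under stated assumptions, not how Wordfast scores matches: `word_overlap` and `char_similarity` are hypothetical helpers, and `difflib` stands in for a character-level comparison.

```python
from difflib import SequenceMatcher


def word_overlap(a: str, b: str) -> float:
    """Share of whole words in `a` that also occur, unchanged, in `b`."""
    words_a, words_b = a.lower().split(), b.lower().split()
    if not words_a:
        return 0.0
    return sum(word in words_b for word in words_a) / len(words_a)


def char_similarity(a: str, b: str) -> float:
    """Character-level similarity in the range 0..1 (difflib stand-in)."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


# Nominative vs. genitive of the same phrase ("new contract"):
nominative = "новый договор"
genitive = "нового договора"

# No whole word is shared, although most characters are.
print(word_overlap(nominative, genitive))
print(char_similarity(nominative, genitive))

# The English equivalent keeps its word forms intact.
print(word_overlap("new contract", "the new contract"))
```

A comparison that counts whole words sees nothing in common between the two Russian phrases, while the English pair overlaps fully. That is one way to picture why a young Russian TM produces few fuzzies until more word forms have accumulated.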



Chris Lovelace  Identity Verified
Argentina
Local time: 02:07
Russian to English
+ ...
TOPIC STARTER
Does it help if the segments are shorter? Nov 26, 2009

At present, I have about 300 lines (of long sentences) in the TM, and have gotten only 2 fuzzy matches. There is not a great deal of repetition in the document, but there ARE at least some repeats.

Is the TM just too small at this point, or should I be loading smaller segments in?

Again, I'm not sure whether this is a dumb question, since I don't really know the software well.

Also, does it help to use wildcards in the glossary? I seem to be getting mixed results.

EXAMPLE: Should the term be entered as "представление", "представлени*", or "представлени" to pick up all the inflected forms?
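
For comparison, here is how the three spellings behave under a generic trailing-wildcard match, sketched in Python with the standard `fnmatch` module. This shows only what an asterisk means in ordinary wildcard matching; how the Wordfast glossary treats each spelling is not settled by this sketch, and `find_term_hits` is a hypothetical helper.

```python
import re
from fnmatch import fnmatchcase


def find_term_hits(text: str, pattern: str) -> list[str]:
    """Return the words in `text` that match a glossary-style pattern.

    Assumption: '*' matches any run of characters, as in shell wildcards.
    """
    words = re.findall(r"\w+", text.lower())
    return [word for word in words if fnmatchcase(word, pattern.lower())]


sample = "Представление о представлении и представлениях"

print(find_term_hits(sample, "представлени*"))  # stem plus wildcard
print(find_term_hits(sample, "представление"))  # full dictionary form
print(find_term_hits(sample, "представлени"))   # bare stem, no wildcard
```

Under this generic matching, only the stem followed by an asterisk picks up every inflected form; the full form matches itself alone, and the bare stem matches nothing.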

Thanks again for your help!

[Edited at 2009-11-26 15:01 GMT]



Sergei Tumanov  Identity Verified
Local time: 08:07
English to Russian
+ ...
I would recommend Nov 26, 2009

to enter short 'chunks' of bigger sentences into the WF glossary.
Sometimes that worked for me, where the bigger sentence was an obvious combination of shorter ones.


[Edited at 2009-11-26 15:10 GMT]



Sergei Leshchinsky  Identity Verified
Ukraine
Local time: 08:07
Member (2008)
English to Russian
+ ...
try adding segmentation rules Nov 26, 2009

Add more stop characters. Make sure it breaks segments at the colon and semicolon; you may also add the em dash (—) as a stop character. That way you will get smaller chunks (as Sergei says above). You will still be able to extend a segment to the next stop character by pressing ALT+CTRL+PageDown, or shrink it with ALT+CTRL+PageUp.
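
The effect of extra stop characters can be sketched in Python. This is an illustration under assumptions, not Wordfast's segmentation engine: the default stop set `.!?` and the function name `segment` are hypothetical, and the sketch ignores abbreviations such as "т. е.", which a real tool has to handle.

```python
import re

# Assumption: sentence-final punctuation is the default stop set.
DEFAULT_STOPS = ".!?"


def segment(text: str, extra_stops: str = "") -> list[str]:
    """Split text after any stop character that is followed by whitespace."""
    stops = re.escape(DEFAULT_STOPS + extra_stops)
    parts = re.split(rf"(?<=[{stops}])\s+", text.strip())
    return [part for part in parts if part]


text = "Условия поставки: товар отгружается в срок; оплата — по факту. Конец."

print(segment(text))                      # default stops only
print(segment(text, extra_stops=":;—"))  # colon, semicolon, em dash added
```

With the default stops the first sentence stays as one long segment; with the extra stops it breaks into four shorter ones, each of which has a better chance of recurring elsewhere.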

[Edited at 2009-11-26 15:25 GMT]



Chris Lovelace  Identity Verified
Argentina
Local time: 02:07
Russian to English
+ ...
TOPIC STARTER
Thanks Nov 26, 2009

Thanks! I'll continue experimenting. You've really helped me out a lot. I appreciate it!

