Choosing a fuzzy match threshold
Thread poster: Nicole Martin

Nicole Martin
Local time: 12:22
German to English
Aug 2, 2007

I did a forum search and found a similar poll discussion, but I would like some more opinions. I use Trados, but this question applies to anyone who uses CAT tools.

I know of an agency that sets its fuzzy match threshold for every project at 85%. I heard the reason they give is that anything below 85% has enough differences that it would take longer to modify the fuzzy match than it would to just retranslate the sentence. My first question is, do you think that's true? I understand this is a pretty broad question, but do you think, in general, that matches below 85% take so much longer that they're not worth using?

I was thinking, the higher you set your threshold, the more words are counted as no matches - and the client pays more for a no match than a fuzzy. I was trying to figure out if this 85% thing is based more on profit or efficiency.
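To make the profit question concrete, here is a minimal sketch of how the billable word count could shift with the threshold. The segment data, the `billable_words` helper and the 50% fuzzy rate are all made-up assumptions for illustration, not any agency's actual rate grid or the way Trados computes its analysis.

```python
def billable_words(segments, threshold, fuzzy_rate=0.5):
    """Return the weighted word count for a list of segments.

    segments   -- list of (word_count, best_match_percent) pairs (hypothetical data)
    threshold  -- matches below this percentage are billed as no matches
    fuzzy_rate -- assumed fraction of the full rate paid for a fuzzy match
    """
    total = 0.0
    for words, match in segments:
        if match >= threshold:
            total += words * fuzzy_rate  # counted as a fuzzy match
        else:
            total += words               # counted as a no match, full rate
    return total


# Four imaginary segments: (word count, best TM match in percent)
segments = [(10, 72), (20, 88), (15, 60), (5, 97)]

at_70 = billable_words(segments, threshold=70)
at_85 = billable_words(segments, threshold=85)
print(at_70, at_85)
```

With these invented numbers the weighted count is 32.5 words at a 70% threshold and 37.5 at 85%, because the 72% segment moves from the fuzzy band to the no-match band. How much that matters in practice depends entirely on the real rate grid and on how many segments fall between the two thresholds.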

Secondly, I was wondering what some of you fellow ProZ-ers set your fuzzy match threshold at, and why. I have been playing around with my level lately, and when I got Trados installed on my new computer last week, I left it at the default 70%. I had it at 85% before, but now I am noticing projects are going a little faster, and at 70% I haven't found anything so radically different from the sentence to translate that it's entirely useless. But I work mostly with technical repair manuals, very precise and repetitive, so maybe that's why a lower match rate works so well.

Also, even if a fuzzy match needs more than minor changes, I would think it at least helps with consistency. You could see how terms in the sentence had been translated previously without taking the time for a concordance search. I would think if you're an agency and you have ongoing work from a client and several different translators translating it (and all sharing the same TM), you would want all the help you could get to ensure consistency.

So what are your opinions?



Steven Capsuto
United States
Local time: 12:22
Spanish to English
95%, though I understand why some people go lower Aug 2, 2007

Anything below 85% usually requires significant retranslation, unless the segment already in the TM was from a different type of document (e.g., HTML versus Word) in which case the percentage match might be lower due to "format penalties."
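As a rough illustration of what a match percentage and a format penalty mean, the sketch below scores two source sentences by word overlap using Python's standard `difflib` module and then subtracts a fixed penalty. This is only an assumption-laden stand-in: Trados and other CAT tools use their own scoring methods, and the `fuzzy_score` function and its `penalty` parameter are hypothetical names invented for this example.

```python
from difflib import SequenceMatcher


def fuzzy_score(new_source, tm_source, penalty=0):
    """Return a 0-100 similarity score between two source sentences.

    Compares the sentences word by word (case-insensitive) and subtracts
    an optional penalty, e.g. for a formatting difference. Illustrative
    only; not the algorithm of any real CAT tool.
    """
    a = new_source.lower().split()
    b = tm_source.lower().split()
    ratio = SequenceMatcher(None, a, b).ratio()
    return max(0, round(ratio * 100) - penalty)


# One word out of four differs between the new sentence and the TM sentence.
score = fuzzy_score("Remove the front cover", "Remove the rear cover")

# Identical text, but with an assumed 2-point penalty for a format mismatch.
penalized = fuzzy_score("Remove the front cover", "Remove the front cover", penalty=2)
print(score, penalized)
```

In this toy scoring, a single changed word in a four-word sentence already drops the score to 75, which is one reason short segments tend to fall below a high threshold even when the edit needed is trivial.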

Even 85-95% matches can be way off the mark. On jobs where it makes sense to offer discounts for matches (e.g., technical documents with repeated headings, sentences that are often reused, etc.), my approach is that 95%-100% matches are free. Repetitions within the document are also free. No other discounts apply.

[Edited at 2007-08-02 15:07]



Samuel Murray
Netherlands
Local time: 18:22
Member (2006)
English to Afrikaans
Depends... Aug 2, 2007

Nicole Dequin wrote:
I know of an agency who sets their fuzzy match threshold for every project at 85%. I heard the reason they give is that anything below 85% has enough differences that it would take longer to modify the fuzzy match than it would to just retranslate the sentence.


Depends on:
* the length of the sentence
* the number of placeables in the sentence
* whether the translator can see the original source text from the translation memory, and whether his CAT tool tells him where the two items differ
* peculiarities of the source and target language

I was thinking, the higher you set your threshold, the more words are counted as no matches - and the client pays more for a no match than a fuzzy. I was trying to figure out if this 85% thing is based more on profit or efficiency.


No, that can't be the reason, because the client would be doing himself an unfavour by setting the threshold higher.

Secondly, I was wondering what some of you fellow ProZ-ers set your fuzzy match threshold at, and why.


Sadly, the developers of Wordfast recently made the terrible, terrible decision to remove the "subfuzzy display" feature, which means Wordfast users are forced to set their fuzzy thresholds lower than before to get the same number of matches. In the old Wordfast, my threshold was 60%, with subfuzzies enabled. In the new Wordfast, my threshold is as low as the program permits.

Also, even if a fuzzy match needs more than minor changes, I would think it at least helps with consistency.


True, but that's what automatic phrase recognition is for (or is this yet another feature Trados doesn't have?).



Steven Capsuto
United States
Local time: 12:22
Spanish to English
Automatic phrase recognition Aug 3, 2007

Samuel Murray wrote:

that's what automatic phrase recognition is for (or is this yet another feature Trados doesn't have?).


I'm not sure... I don't use Wordfast very much, so I don't know how that works.

In Trados, you can add frequently used phrases to a termbase, and then bring them into the document by clicking a button once they're recognized.



Nicole Martin
Local time: 12:22
German to English
TOPIC STARTER
Thanks for the responses! Aug 3, 2007

Samuel Murray wrote:
I was thinking, the higher you set your threshold, the more words are counted as no matches - and the client pays more for a no match than a fuzzy. I was trying to figure out if this 85% thing is based more on profit or efficiency.


No, that can't be the reason, because the client would be doing himself an unfavour by setting the threshold higher.


Could you elaborate on why they would be doing themselves an unfavor? I can imagine one reason would be a possibility for less consistency with previous translations or maybe ending up with an unnecessarily high word count that could make deadlines difficult. But I was wondering what your ideas were. I'm not saying it's the best translation practice to "force" the level of no matches higher, but I can see how it can increase profit which could be an important (if not overriding) factor for some agencies.

I have to confess I only have experience with Trados, and I really don't know the intricacies of other CAT tools, so I didn't realize how much their different features can affect where you set a fuzzy match level. For the sake of this discussion, I am referring to Trados, and it does highlight differences between the fuzzy source sentence and the new source sentence when it presents the translator with a fuzzy match. It also tells you whether the differences are just formatting or placeables as opposed to textual differences.

