# How are Trados fuzzy matches defined?

Harry Bornemann
Mexico
English to German
...
 Nov 2, 2004

Does someone know the exact formula?

For example, if I have a first segment
"one two three" and a second segment
"two three four"
I thought that the second one would be a 66% match, because two thirds of it are contained in the first one.
But Trados tells me that this is only a 34% match, although it is a simple example without formatting, tags, etc.

Jerzy Czopik
Germany
Local time: 21:20
Member (2003)
Polish to German
...
 I don't know the exact formula Nov 2, 2004

but in your example, not only has a third of the segment changed; the position of the words has changed too. Since Trados does not "understand" the text, it analyses it mathematically, taking words, abbreviations, declension, position in the sentence, formatting, tags and so on into account.
I could imagine that the match between "one two three" and "two two three" would be somewhere around 66% ("four" has one letter more than "one", so perhaps that matters too).

Regards
Jerzy

Harry Bornemann
Mexico
English to German
...
TOPIC STARTER
 two two three Nov 2, 2004

Thanks Jerzy,

You guessed right: "two two three" gives 67%.

Nevertheless I would still need the definition.
I wonder whether it is patented or even a secret,
although it is the base of our calculations...
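
As an illustration only, the two figures reported in this thread (34% and 67%) can be compared against two simple measures: word containment (the "two thirds of it are contained" reasoning) and a word-level edit distance that is sensitive to position. Neither is the Trados formula, which is not public; the function names and the normalization by the longer segment are assumptions made for this sketch.

```python
def containment_score(source, candidate):
    """Percentage of candidate words that also occur anywhere in source."""
    source_words = set(source.split())
    words = candidate.split()
    if not words:
        return 100.0
    hits = sum(1 for w in words if w in source_words)
    return 100.0 * hits / len(words)


def word_edit_distance(a, b):
    """Levenshtein distance computed over whole words instead of characters."""
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, start=1):
        curr = [i]
        for j, wb in enumerate(b, start=1):
            cost = 0 if wa == wb else 1
            curr.append(min(prev[j] + 1,          # delete wa
                            curr[j - 1] + 1,      # insert wb
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1]


def edit_distance_score(source, candidate):
    """Assumed normalization: 1 - distance / length of the longer segment."""
    a, b = source.split(), candidate.split()
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0
    return 100.0 * (1 - word_edit_distance(a, b) / longest)


for candidate in ("two three four", "two two three"):
    print(candidate,
          round(containment_score("one two three", candidate)),
          round(edit_distance_score("one two three", candidate)))
```

Word containment ignores position and rates "two two three" as fully contained, while the word-level edit distance yields roughly 33% and 67% for the two examples. That is close to the reported 34% and 67%, but it only shows that a position-sensitive word measure is compatible with these two data points, not that Trados works this way.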

Jerzy Czopik
Germany
Local time: 21:20
Member (2003)
Polish to German
...
 Write an email to Nov 2, 2004

support-de // trados.com, maybe they can give you an answer.

Regards
Jerzy

Selçuk Budak
Local time: 23:20
English to Turkish
...
 Check "Penalties" Tab Nov 2, 2004

When determining the fuzzy match percentage, Trados takes into account the "penalties" you set under "Options/Translation Memory Options".

For example, if you set "formatting differences penalty" to 10,
it subtracts this figure from the overall match figure. Assume that you have two identical strings: "one two three" and "one two three". If there is a formatting difference, you would not get a 100% match but a 90% match (assuming that the other parameters are set to 0).

So to increase the fuzzy match value, lower the penalty values defined on that tab.

h.i.h.
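
The subtraction described above can be sketched as follows. This is a minimal illustration, assuming penalties are simply summed and deducted from the base score; the function name, the dictionary keys and the clamp at 0 are hypothetical and not taken from Trados.

```python
def apply_penalties(base_match, penalties):
    """Subtract the sum of all configured penalties from a base match percentage.

    base_match: match percentage before penalties (0-100).
    penalties:  mapping of penalty name to penalty points (names are illustrative).
    """
    total = sum(penalties.values())
    # Assumption: the result never drops below 0.
    return max(0, base_match - total)


# Two identical strings that differ only in formatting, with a
# formatting penalty of 10 and every other penalty set to 0.
print(apply_penalties(100, {"formatting": 10, "other": 0}))
```

With the settings from the example in the post, this gives 90 instead of 100.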

Harry Bornemann
Mexico
English to German
...
TOPIC STARTER
 I need to know it exactly Nov 2, 2004

Thanks Selçuk,

but I need to know the complete definition (formula, algorithm),
so I wrote to Trados support, as Jerzy suggested.

I will post the results,

Harry

Harry Bornemann
Mexico
English to German
...
TOPIC STARTER
 It is a Business Secret Nov 3, 2004

They won't tell me, because it is one of their critical business secrets.
So we will never know what fuzzy match really means.

Brandis (X)
Local time: 21:20
English to German
...
 I think it goes on character basis Nov 3, 2004

Hi! Well, Trados won't tell us. I think they do it on a character basis: a = a is 100%, and ab gives 50% each, etc. But that is very mathematical.
Brandis

Harry Bornemann
Mexico
English to German
...
TOPIC STARTER
 Different definitions of different vendors Nov 3, 2004

It is a critical business secret.
Déjà Vu:
A fuzzy match is one in which the sentence retrieved from the Translation Memory is not identical but only similar to the one currently being translated. The percentage you see is calculated by taking into account how many words differ, how the embedded codes differ, and also the order words are in. The percentage is not something accurate (except from a very specific point of view) and should only be used as an indicator to the translator who must then evaluate these matches and accept or reject them on the basis of his knowledge and understanding.

The actual method for calculating these percentages, which is very closely related to the way the searches are done, I cannot reveal.
Wordfast:
A fuzzy match is a non-exact match for a TU source segment (present in the memory) as compared to a document's source segment (the segment we want to translate). The algorithm is very complex. On longer sentences, the program calculates the percentage of words that are found in the two segments.
Words that begin with the same letters (like "Frau" and "Frauen") but which are different will of course be counted as being present in both sentences, but with some degree of penalty based on how many letters differ, and with some threshold limits.

Fuzzy-match algorithms are not really trade secrets, but it would take pages and pages to describe them. What matters, essentially, is that the similarity is rated as a percentage, based on words (exact or similar) rather than letters, and most fuzzy-match algorithms are essentially the same with only very minor differences. In real life, translators care for anything above 80% and *really* care for 99 and 100%. Anything lower, practically, needs attention and re-translation anyway. Bickering over finer fuzziness points, which is very language-dependent, is an activity I would not spend too much time on.
Passolo:
It is a business secret only revealed to big clients who need it to optimize their work flow.

Still no accurate definition found...

[Edited at 2004-11-04 12:02]
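
To make the Wordfast description above more concrete (the share of words found in both segments, with partial credit for different words that begin with the same letters), here is a rough sketch. It is not Wordfast's algorithm: the prefix threshold `min_prefix`, the partial-credit ratio and the normalization by the longer segment are all assumptions chosen for illustration.

```python
from os.path import commonprefix


def word_similarity(w1, w2, min_prefix=3):
    """Return 1.0 for identical words, partial credit for a shared prefix.

    Assumption: words sharing fewer than min_prefix leading letters
    count as unrelated (a stand-in for the "threshold limits").
    """
    if w1 == w2:
        return 1.0
    shared = len(commonprefix([w1, w2]))
    if shared < min_prefix:
        return 0.0
    # Penalty grows with the number of letters that differ.
    return shared / max(len(w1), len(w2))


def segment_match(source, candidate):
    """Word-based match percentage between two segments (order is ignored)."""
    src, cand = source.split(), candidate.split()
    if not src and not cand:
        return 100.0
    if not src or not cand:
        return 0.0
    # Each source word is credited with its best counterpart in the candidate.
    total = sum(max(word_similarity(w, c) for c in cand) for w in src)
    return 100.0 * total / max(len(src), len(cand))


print(round(segment_match("die Frau", "die Frauen")))
```

In this sketch "Frau" and "Frauen" share four of six letters, so the pair earns two thirds of a full word match instead of none.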

Klas Törnquist
Local time: 21:20
English to Swedish
...
 Unacceptable Nov 3, 2004

Harry_B wrote:

They won't tell me, because it is one of their critical business secrets.
So we will never know what fuzzy match really means.

I think this is quite unacceptable. Many agencies base their discount demands on Trados analyses. Some even use all the different Trados fuzzy percentages.
IIRC, Trados (or Translationzone) has even published "recommended rate reductions" for fuzzies.
Thus, Trados more or less decides "standard discounts" and still they won't tell us what the fuzziness really is.

Klas

