It is a critical business secret.
A fuzzy match is one in which the sentence retrieved from the Translation Memory is not identical, only similar, to the one currently being translated. The percentage you see is calculated by taking into account how many words differ, how the embedded codes differ, and also the order the words are in. The percentage is not an exact measure (except from a very specific point of view) and should only be used as an indicator by the translator, who must then evaluate these matches and accept or reject them on the basis of their own knowledge and understanding.
The actual method for calculating these percentages, which is very closely related to the way the searches are done, I cannot reveal.
A fuzzy match is a non-exact match between a TU source segment (present in the memory) and a document's source segment (the segment we want to translate). The algorithm is very complex. On longer sentences, the program calculates the percentage of words that are found in both segments.
Words that begin with the same letters (like "Frau" and "Frauen") but are otherwise different will still be counted as present in both sentences, but with a penalty based on how many letters differ, and subject to some threshold limits.
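To make the idea above concrete, here is a rough Python sketch of a word-based percentage with a prefix penalty. It is only an illustration, not what any particular TM tool does: the function names, the minimum shared prefix of 3 letters, the penalty formula (shared prefix length divided by the longer word's length) and the 0.5 threshold are all my own assumptions.

```python
def word_similarity(a: str, b: str, min_prefix: int = 3) -> float:
    """Score two words: 1.0 if identical, a penalised score if they share
    a prefix of at least `min_prefix` letters (assumed value), else 0.0."""
    a, b = a.lower(), b.lower()
    if a == b:
        return 1.0
    prefix = 0
    for x, y in zip(a, b):
        if x != y:
            break
        prefix += 1
    if prefix < min_prefix:
        return 0.0
    # Assumed penalty: the more letters differ, the lower the score.
    return prefix / max(len(a), len(b))


def fuzzy_percentage(tm_segment: str, doc_segment: str, threshold: float = 0.5) -> int:
    """Percentage of words found (exactly or approximately) in both segments.
    Word scores below `threshold` (assumed value) are not counted at all."""
    tm_words = tm_segment.split()
    doc_words = doc_segment.split()
    if not tm_words or not doc_words:
        return 0
    remaining = list(tm_words)  # each TM word may be matched only once
    total = 0.0
    for word in doc_words:
        best_score, best_index = 0.0, -1
        for i, candidate in enumerate(remaining):
            score = word_similarity(word, candidate)
            if score > best_score:
                best_score, best_index = score, i
        if best_score >= threshold:
            total += best_score
            remaining.pop(best_index)
    return round(100 * total / max(len(tm_words), len(doc_words)))


print(fuzzy_percentage("Die Frau liest", "Die Frauen lesen"))
```

Here "Frau" and "Frauen" count as a partial match, while "liest" and "lesen" share only one leading letter and are treated as different words. Note that this sketch ignores word order, punctuation and embedded codes, which real tools reportedly also take into account.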
Fuzzy-match algorithms are not really trade secrets, but it would take pages and pages to describe them. What matters, essentially, is that the similarity is rated as a percentage, based on words (exact or similar) rather than letters, and most fuzzy-match algorithms are essentially the same, with only very minor differences. In real life, translators care about anything above 80% and *really* care about 99% and 100%. Anything lower, practically, needs attention and re-translation anyway. Bickering over finer fuzziness points, which is very language-dependent, is an activity I would not spend too much time on.
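As a small illustration of "words rather than letters", the sketch below compares two segments as lists of words using Python's standard difflib.SequenceMatcher, then sorts the result into the rough bands mentioned above. This is a generic stand-in technique, not the algorithm of any specific TM product; the function names and the band labels are made up for the example.

```python
from difflib import SequenceMatcher


def word_level_percentage(tm_segment: str, doc_segment: str) -> int:
    """Similarity computed over whole words, so word order also matters."""
    tm_words = tm_segment.lower().split()
    doc_words = doc_segment.lower().split()
    return round(100 * SequenceMatcher(None, tm_words, doc_words).ratio())


def band(percent: int) -> str:
    """Rough bands taken from the post above; the labels are hypothetical."""
    if percent >= 99:
        return "exact or near-exact"
    if percent > 80:
        return "useful fuzzy match"
    return "needs attention and re-translation"


score = word_level_percentage("Press the red button", "Press the green button")
print(score, band(score))
```

With one word out of four changed, this particular measure gives 75%, which already falls below the 80% mark. That is a reminder of how coarse such figures are on short segments.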
It is a business secret, revealed only to big clients who need it to optimize their workflow.
Still no accurate definition found...
[Edited at 2004-11-04 12:02]