Fuzzy word match in concordance search
Thread poster: TrM Translations

TrM Translations
Local time: 20:37
Member (2007)
English to Hungarian
+ ...
Sep 19, 2008


I am new to Across, and I can't figure out how to perform a fuzzy search in the concordance, i.e. searching for the term "concordance" should also find the entries containing "concordances" and vice versa. Trados does this in its concordance search function. Right now all I get in Across is exact matches.

I saw that the Across termbase system (crossTerm) does allow wildcards (*), and I'd need something similar for the concordance search.

Version: Personal Edition 4.00 SP1c_EN Package version 4

Thanks for the help.

Istvan FULOP
TRM Translations


Katarzyna Slowikova  Identity Verified
Local time: 20:37
English to Czech
+ ...
Same here! Jun 20, 2014

I have the very same problem in the latest version of Across Language Server. I have been using it for a year or so now and it's been there from the beginning (the updates are downloaded automatically).
Only 100% matches are ever found, which makes the concordance very difficult to use in inflected languages.
To be sure, I have the match percentage set to the lowest value, 50%.
Is this normal behaviour?
I really hope to get an answer here!


Milan Condak  Identity Verified
Local time: 20:37
English to Czech
Old manual on fuzzy terminology recognition Jun 20, 2014

TrM Hungarian Translations wrote:

I saw that the across termbase system (crossterm) does allow for wildcards (*)

I am a Wordfast trainer, and I hope the Across feature is similar to the one in Wordfast Classic (WFC).
The old manual may be useful.

My first question: who enters the terminology into the glossary?
My second question: do the glossary entries have asterisks (*) at the word endings?
My advice: put asterisks into the termbase.

Here is more info:

Wordfast Classic manual (2004, by Yves Champollion) on propagate, asterisks (wildcard), fuzzy terminology recognition, stripping and stemming:


If a recognised single term ends with a wildcard, the whole word is replaced, rather than just its root. Thus, if the glossary has affect* = affecter and the source text has affection, the final result will be affecter rather than affection.
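The whole-word replacement described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Wordfast's own code: the function name `apply_wildcard_term` is hypothetical, and treating "the whole word" as the stem plus any following word characters is an assumption.

```python
import re


def apply_wildcard_term(text: str, source_term: str, target: str) -> str:
    """Replace every whole word matching a trailing-wildcard term.

    Hypothetical helper: illustrates the manual's description only.
    """
    if not source_term.endswith("*"):
        raise ValueError("this sketch only handles terms ending in '*'")
    stem = re.escape(source_term[:-1])
    # Assumption: the stem must start at a word boundary, and the match
    # extends over all following word characters (the "whole word").
    pattern = re.compile(rf"\b{stem}\w*", re.IGNORECASE)
    return pattern.sub(target, text)


# The manual's example: affect* = affecter, source word "affection".
print(apply_wildcard_term("great affection", "affect*", "affecter"))
```

With the glossary entry affect* = affecter, the word affection is replaced as a whole, so no leftover ending such as "-ion" is attached to the target term.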

Terminology format

Terms can use upper and/or lower case. Avoid unnecessary characters like brackets, quotes, slashes, dashes etc. unless absolutely necessary. The * wildcard can be used at the end of a term, if different forms of a term are possible (this is called MFTR and described below). Here is a sample English-French glossary:



minimum wage*
salaire* minim*

Do not place the * wildcard fewer than four characters from the beginning of an entry. So, pa* the bill* is not valid; use three entries like pay the bill*, pays the bill* and paid the bill*.
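A glossary can be checked against this rule before it is loaded. The sketch below is a minimal example under one assumption: the rule is read as "at least four characters must precede the first asterisk in the entry". The names `MIN_PREFIX` and `is_valid_entry` are hypothetical and not part of Wordfast.

```python
# Assumed reading of the rule: at least four characters must come
# before the first '*' in an entry.
MIN_PREFIX = 4


def is_valid_entry(entry: str) -> bool:
    """Return True if the entry has no wildcard or the first one is far enough in."""
    first_star = entry.find("*")
    return first_star == -1 or first_star >= MIN_PREFIX


for entry in ["pa* the bill*", "pay the bill*", "minimum wage*", "salaire* minim*"]:
    print(entry, "->", "ok" if is_valid_entry(entry) else "invalid")
```

Under this reading, pa* the bill* is rejected because only two characters precede the first asterisk, while the other three entries pass.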

During a translation session, press Shift+Ctrl+G to load glossaries into a toolbar drop-down list for better visibility. Outside sessions, use Ctrl+Alt+Left/Right to display/hide the glossary lists. Note that glossaries of more than 5,000 entries, or more than 200 Kbytes, cannot be loaded into a toolbar drop-down list. But when looking up terms, Wordfast will load the term, plus 50 terms before and after the found term, for reference. These large glossaries can nevertheless be used for all other operations: QC, terminology recognition, etc. They are fully opened and editable using the glossary editor (the icon after the glossary drop-down list).

This is where AFTR really helps, and yields best results. Once the job is completed, and you have a spare hour, you may consider integrating client terminology into one of your existing glossaries, and manually add asterisks like:

two-way multiplexed autoresponder*

double furnace boiler*

dichotomic search*

DOS-based application*

This way, your homegrown glossary runs on MFTR rather than AFTR.

Two PB (Pandora's Box) commands can be used to fine-tune AFTR: GloStemmingRule and GloStems.

The essence of AFTR is to determine a word's stem by gradually stripping letters from the word's end. Note that we deal here with statistics - there are exceptions to this rule, and every language has its requirements. The verb go, for example, will change into went in the past tense, thereby defeating any AFTR attempt. Fortunately, client terminology is primarily made of technical words and expressions, where nouns outnumber verbs by a clear margin, thereby minimizing the problem of verbs and their changing roots.
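The stripping idea can be illustrated with a short Python sketch. It is not Wordfast's actual AFTR algorithm: the functions `candidate_stems` and `fuzzy_term_match` and the parameters `min_len` and `max_strip` are hypothetical, and they are not the GloStemmingRule or GloStems settings mentioned above.

```python
def candidate_stems(word: str, min_len: int = 4, max_strip: int = 3) -> list[str]:
    """Return the word plus shorter forms made by stripping trailing letters.

    min_len and max_strip are illustrative assumptions, not Wordfast settings.
    """
    word = word.lower()
    stems = []
    for strip in range(max_strip + 1):
        stem = word[: len(word) - strip]
        if len(stem) < min_len:
            break  # stop before the stem becomes too short to be meaningful
        stems.append(stem)
    return stems


def fuzzy_term_match(word: str, term: str) -> bool:
    """True if the words are identical or share a stem after stripping."""
    if word.lower() == term.lower():
        return True
    return bool(set(candidate_stems(word)) & set(candidate_stems(term)))


print(fuzzy_term_match("concordances", "concordance"))  # regular plural
print(fuzzy_term_match("went", "go"))                   # irregular verb
```

The pair concordances/concordance shares the stem "concordance" and matches, while went/go shares no stem at all, which is the kind of exception the manual warns about.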



Katarzyna Slowikova  Identity Verified
Local time: 20:37
English to Czech
+ ...
Asterisks do not work for the concordance search Jun 23, 2014

or more precisely, they're ignored: "kužel*" will find only occurrences with "kužel", as if the asterisk was not there.

