https://www.proz.com/forum/across_support/115829-fuzzy_word_match_in_concordance_search.html

Fuzzy word match in concordance search
Thread poster: TrM Translations
TrM Translations
Hungary
Local time: 03:31
English to Hungarian
Sep 19, 2008

Hello,

I am new to Across, and I can't figure out how to perform a fuzzy search in the concordance, i.e. searching for the term "concordance" should also find the entries containing "concordances" and vice versa. Trados does this in its concordance search function. Right now all I get in Across is exact matches.

I saw that the Across termbase system (crossTerm) does allow wildcards (*), and I'd need something similar for the concordance search.
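
To make the request concrete, here is a minimal sketch of word-level fuzzy matching using Python's standard difflib module. It is not how Across or Trados implement their concordance search; the function name fuzzy_concordance and the threshold parameter are hypothetical, chosen only for illustration.

```python
from difflib import SequenceMatcher


def fuzzy_concordance(query, segments, threshold=0.5):
    """Return (segment, score) pairs for segments that contain a word
    whose similarity to the query is at least `threshold`."""
    query = query.lower()
    hits = []
    for segment in segments:
        words = segment.lower().split()
        # Best similarity ratio between the query and any word in the segment.
        best = max(
            (SequenceMatcher(None, query, w).ratio() for w in words),
            default=0.0,
        )
        if best >= threshold:
            hits.append((segment, round(best, 2)))
    return hits
```

With a threshold of 0.8, a query for "concordance" returns a segment containing "concordances" as well (the similarity of the two words is about 0.96), which is the behaviour asked for here.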

Version: Personal Edition 4.00 SP1c_EN Package version 4

Thanks for the help.

Istvan FULOP
TRM Translations
Hungary


 
Katarzyna Slowikova  Identity Verified
Germany
Local time: 03:31
English to Czech
Same here! Jun 20, 2014

I have the very same problem in the latest version of Across Language Server. I have been using it for about a year now, and the problem has been there from the beginning (the updates are downloaded automatically).
Only 100% matches are ever found, which makes the concordance search very difficult to use for inflected languages.
To be sure, I have set the match percentage to the lowest value, 50%.
Is this normal behaviour?
I really hope to get an answer here!
Katarzyna


 
Milan Condak  Identity Verified
Local time: 03:31
English to Czech
Old manual on fuzzy terminology recognition Jun 20, 2014

TrM Hungarian Translations wrote:

I saw that the Across termbase system (crossTerm) does allow for wildcards (*)


I am a Wordfast trainer, and I hope the Across feature is similar to the one in Wordfast Classic (WFC).
The old manual may be useful.

My first question: who enters the terminology into the glossary?
My second question: do the glossary entries have asterisks (*) at the word endings?
My advice: put asterisks into the termbase.

Here is more info:

Wordfast Classic manual (2004, by Yves Champollion) on propagate, asterisks (wildcard), fuzzy terminology recognition, stripping and stemming:


PropagateWhole

If a recognised single term ends with a wildcard, the whole word is replaced, rather than just its root. Thus, if the glossary has affect* = affecter and the source text has affection, the final result will be affecter rather than affection.

Terminology format

Terms can use upper and/or lower case. Avoid unnecessary characters like brackets, quotes, slashes, dashes, etc. unless absolutely necessary. The * wildcard can be used at the end of a term if different forms of a term are possible (this is called MFTR and described below). Here is a sample English-French glossary:

Maintenance*
Entretien*

Interview*
Entrevue*

minimum wage*
salaire* minim*

Do not place the * wildcard less than four characters from the beginning of an entry. So, pa* the bill* is not valid; use three entries like pay the bill*, pays the bill* and paid the bill*.
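
As a rough illustration of the trailing-wildcard convention quoted above, the sketch below turns glossary entries into regular expressions and looks them up in a source sentence. This is not Wordfast's actual code: the names is_valid_entry, term_to_regex and find_terms are hypothetical, and reading the four-character rule as "at least four characters must precede the first asterisk" is an assumption.

```python
import re

MIN_PREFIX = 4  # assumed reading of the "four characters" rule


def is_valid_entry(entry):
    """Accept entries with no wildcard, or whose first * has at least
    MIN_PREFIX characters before it."""
    pos = entry.find("*")
    return pos == -1 or pos >= MIN_PREFIX


def term_to_regex(term):
    """Compile a glossary term into a regex; a word ending in * matches
    that stem followed by any further word characters."""
    parts = []
    for word in term.split():
        if word.endswith("*"):
            parts.append(re.escape(word[:-1]) + r"\w*")
        else:
            parts.append(re.escape(word))
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)


def find_terms(text, glossary):
    """Return (source_term, target_term) pairs whose source term occurs in text."""
    found = []
    for source, target in glossary.items():
        if is_valid_entry(source) and term_to_regex(source).search(text):
            found.append((source, target))
    return found
```

For example, with the sample glossary above, find_terms("She interviewed workers about minimum wages.", glossary) picks up both Interview* and minimum wage*, while is_valid_entry("pa* the bill*") returns False.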

During a translation session, press Shift+Ctrl+G to load glossaries into a toolbar drop-down list for better visibility. Outside sessions, use Ctrl+Alt+Left/Right to display/hide the glossary lists. Note that glossaries of more than 5,000 entries, or more than 200 Kbytes, cannot be loaded into a toolbar drop-down list. But when looking up terms, Wordfast will load the term, plus 50 terms before and after the found term, for reference. These large glossaries can nevertheless be used for all other operations: QC, terminology recognition, etc. They are fully opened and editable using the glossary editor (the icon after the glossary drop-down list).

This is where AFTR really helps, and yields best results. Once the job is completed, and you have a spare hour, you may consider integrating client terminology into one of your existing glossaries, and manually adding asterisks, like:

two-way multiplexed autoresponder*

double furnace boiler*

dichotomic search*

DOS-based application*

This way, your homegrown glossary runs on MFTR rather than AFTR.

Two PB (Pandora's Box) commands can be used to fine-tune AFTR: GloStemmingRule and GloStems.

The essence of AFTR is to determine what is a word's stem by gradually stripping letters from the word's end. Note that we deal here with statistics - there are exceptions to this rule, and every language has its requirements. The verb go, for example, will change into went in the past tense, thereby defeating any AFTR attempt. By chance, client terminology is primarily made of technical words and expressions, where nouns outnumber verbs by a clear margin, thereby minimizing the problem of verbs and their changing roots.
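
The stripping idea described above can be sketched as follows. This is only an illustration of the general principle, not Wordfast's implementation and not the semantics of GloStemmingRule or GloStems; the function names and the max_strip and min_stem parameters are hypothetical.

```python
def candidate_stems(word, max_strip=3, min_stem=4):
    """Return the lowercased word plus shorter prefixes obtained by
    stripping up to max_strip trailing letters, never going below
    min_stem characters."""
    word = word.lower()
    stems = [word]
    for n in range(1, max_strip + 1):
        if len(word) - n < min_stem:
            break
        stems.append(word[:-n])
    return stems


def fuzzy_term_match(glossary_word, text_word, max_strip=3, min_stem=4):
    """True if the two words share at least one candidate stem."""
    a = set(candidate_stems(glossary_word, max_strip, min_stem))
    b = set(candidate_stems(text_word, max_strip, min_stem))
    return bool(a & b)
```

Under these assumptions, "boiler" matches "boilers" and "search" matches "searching", whereas "go" and "went" share no stem, which is exactly the kind of exception the manual mentions. Being statistical, the approach also produces false positives: "boiler" would match "boiled" as well.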


Milan


 
Katarzyna Slowikova  Identity Verified
Germany
Local time: 03:31
English to Czech
Asterisks do not work for the concordance search Jun 23, 2014

or more precisely, they're ignored: "kužel*" will find only occurrences with "kužel", as if the asterisk was not there.

 

