Pages in topic:   [1 2 3] >
What do you use concordance search for?
Thread poster: Dominique Pivard

Dominique Pivard  Identity Verified
Local time: 22:50
Finnish to French
Feb 22, 2013

I have a question for Studio users: do you use concordance search primarily for searching (single) words or phrases (expressions)? And do you expect it to always find words that you know are in the TM?

The reason I'm asking is a recent article published in SDL's knowledge base, which says:
Concordance Search is NOT a full-text search through your complete translation memory, rather it is a fuzzy search that has been designed to return meaningful results in large data sets as quickly as possible.
and:
Although you can use Concordance Search for words and characters, the algorithm is optimized for searching phrases.

I personally use concordance search primarily for single words, and I find it very disturbing if a CAT tool fails to find occurrences of these words in the TM if there are some. I mostly search the Finnish side of my TM, which means that words can occur in a variety of forms (because of declensions and other suffixes).


 

Grzegorz Gryc  Identity Verified
Local time: 21:50
French to Polish
Both... Feb 22, 2013

Dominique Pivard wrote:

I have a question for Studio users: do you use concordance search primarily for searching (single) words or phrases (expressions)?

Both

and do you expect it to always find words that you know are in the TM?

Yes.

Cheers
GG


 

Bernard Lieber  Identity Verified
Local time: 21:50
English to French
Both as well Feb 22, 2013

I find that concordance search is not that great in Studio 2009/11: it lacks a number of features available in other CAT tools. Trados 2007 displays a concordance popup when the appropriate box is checked, an option that is no longer available in 2009/11.

Other CAT tools display portions or even let you highlight the relevant expression(s) and automatically copy them into the segment.

Bernard


 

SDL Community  Identity Verified
United Kingdom
Local time: 21:50
English
Terminology? Feb 22, 2013

Hi Dominique

I created that article and honestly it took me some time to understand all the settings you can set and what different results you can achieve. Also, the questions "why" and "why not" came up several times. That's why I created the article: to make sure that customers know the difference between a full-text search through the entire TM and a Concordance Search, which is similar to a fuzzy search.

Actually it does not fail to find occurrences of a word you are searching with the Concordance Search.

Concordance search allows you to search for a term or word in its different contexts within phrases. I think it would make more sense here to search for phrases that include your word since, as you say, it might have different translations depending on the context of the phrase.

Just imagine you have a Translation Memory that contains 100.000 segments (I think this size is normal for a translator nowadays?!) and the word/term you are searching for occurs in 500 segments. Would you really want to wait until the search has gone through your complete TM, only to get 500 results that you then need to go through to find the translation used in a specific context?

At that point we would have thousands of customers complaining, primarily about performance issues, and surely also asking "why do I receive so many useless results?".


I agree there are possible enhancements that would give you the option to, for example:
- Search through the complete TM when performing Concordance Search
or
- The possibility to increase the Concordance Search hits

These are two things that came to mind when I wrote the article and was getting to understand the "concordance search universe" during my testing. Both have been passed on to our development team (so you do not need to worry that we are not aware of this).



I mostly search the Finnish side of my TM, which means that words can occur in a variety of forms (because of declensions and other suffixes).


Maybe an alternative:
If you just want to look up the different translations that exist for specific terms, would it not make more sense here to create a termbase for those terms, where you can add a variety of synonyms and automatically get possible translations from TermRecognition, which you can see directly without needing to use concordance search?


Many thanks
Richard Puschmann | Support Engineer | SDL | Language Technologies Division |

Have a problem with our support? Maybe you already can find an answer in our Knowledge Base here: http://kb.sdl.com

Here you find solutions to known issues and many more....

If you want to keep up-2-date with the latest released articles you can also follow us on Twitter here: https://twitter.com/SDLSupport


 

Samuel Murray  Identity Verified
Netherlands
Local time: 21:50
Member (2006)
English to Afrikaans
What I use concordance search for (as a non-Trados user) Feb 22, 2013

SDL Support wrote:
If you just want to look up the different translations that exist for specific terms, would it not make more sense here to create a termbase for those terms...


Huh... what??

:)

As a non-Trados user, I use the concordance feature of my CAT tool(s) specifically to save me from having to create termbases manually. When I do find a term in a concordance search, I do often add it to the glossary, but it ain't in my glossary at the time when I do the concordance search to begin with.

My apologies if my comment relates to something that Trados can't do.


 

nrichy (X)
France
Local time: 21:50
French to Dutch
Single words or small expressions Feb 22, 2013

I use Concordance extensively to search in the TM or in aligned material (in fact, for me it's the only reason to align something). I search for words, parts of words or small expressions which come back regularly or even titles (2-5 words).
I do expect it to find all occurrences - but if there are lots of occurrences, the first 10 or so will do, so that I can refine my search.
If there are no 100% occurrences, I expect it to find fuzzies.
As an old WFC user, I decide then if I enter this term with two clicks into the glossary, so that it can autopropagate, or not.
I am a beginner with Studio, and after 6 months I still fail to see how I can achieve a similarly useful process.


 

Dominique Pivard  Identity Verified
Local time: 22:50
Finnish to French
TOPIC STARTER
doesn't concordance search in Studio rely on an index? Feb 23, 2013

SDL Support wrote:
Actually it does not fail to find occurrences of a word you are searching with the Concordance Search.

Unfortunately, it is not so with Finnish words: concordance search will miss words that are in the TM if you don't enter the exact form found in the TM. Even if you enter the first 18 letters of a word that has 19 letters, Studio will fail to find it.

SDL Support wrote:
Concordance search allows you to search for a term or word in its different contexts within phrases. I think it would make more sense here to search for phrases that include your word since, as you say, it might have different translations depending on the context of the phrase.

The problem is that the notion of "phrase search" doesn't really apply to all languages. For instance, in Finnish you would typically have a single word where you would have a preposition + an article + a noun in English.

SDL Support wrote:
Just imagine you have a Translation Memory that contains 100.000 segments (I think this size is normal for a translator nowadays?!) and the word/term you are searching for occurs in 500 segments. Would you really want to wait until the search has gone through your complete TM, only to get 500 results that you then need to go through to find the translation used in a specific context?

But doesn't concordance search rely on an index, like matches at the sentence level? If so, it shouldn't matter whether your TM has 5000, 50.000 or 500.000 units.
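
To illustrate the point about indexes, here is a minimal Python sketch. It assumes a plain word-level inverted index; that is an assumption made for illustration only, not a description of how Studio actually stores or searches a TM. The function names `build_index` and `lookup` are hypothetical.

```python
from collections import defaultdict

PUNCTUATION = ".,;:!?()\"'"

def build_index(segments):
    """Map each lowercased token to the ids of the segments that contain it."""
    index = defaultdict(set)
    for seg_id, text in enumerate(segments):
        for token in text.lower().split():
            index[token.strip(PUNCTUATION)].add(seg_id)
    return index

def lookup(index, word):
    """Return the ids of segments containing exactly this word form."""
    # A dictionary access: the cost does not grow with the number of segments.
    return sorted(index.get(word.lower(), set()))

# Tiny made-up TM (source side only), Finnish examples.
segments = [
    "Talossa on kaksi ovea.",
    "Talo on punainen.",
    "Ovi on auki.",
]
index = build_index(segments)

print(lookup(index, "talossa"))  # exact form: found
print(lookup(index, "talo"))     # base form: the inflected "Talossa" is not returned
```

With an index like this, the lookup itself is cheap whatever the TM size. The same sketch also shows why an index keyed on exact word forms misses inflected forms: "talo" and "talossa" are simply different keys.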

SDL Support wrote:
At that point we would have thousands of customers complaining, primarily about performance issues, and surely also asking "why do I receive so many useless results?".

Again, I don't see why searching for concordances should be any different (performance-wise) from searching matches at the segment level.

SDL Support wrote:
I agree there are possible enhancements that would give you the option to, for example:
- Search through the complete TM when performing Concordance Search
or
- The possibility to increase the Concordance Search hits

Yes, if the user is prepared to pay the price (in the form of longer search times), he could opt-in for these settings.


Maybe an alternative:
If you just want to look up the different translations that exist for specific terms, would it not make more sense here to create a termbase for those terms, where you can add a variety of synonyms and automatically get possible translations from TermRecognition, which you can see directly without needing to use concordance search?

Of course, having everything in a termbase would solve the problem, but in practice, concordance search is often a substitute for finding terms that are missing from a termbase.


 

SDL Community  Identity Verified
United Kingdom
Local time: 21:50
English
... Feb 23, 2013

SDL Support wrote:
Actually it does not fail to find occurrences of a word you are searching with the Concordance Search.

Unfortunately, it is not so with Finnish words: concordance search will miss words that are in the TM if you don't enter the exact form found in the TM. Even if you enter the first 18 letters of a word that has 19 letters, Studio will fail to find it.


Have you enabled the Character-based concordance?

SDL Support wrote:
Concordance search allows you to search for a term or word in its different contexts within phrases. I think it would make more sense here to search for phrases that include your word since, as you say, it might have different translations depending on the context of the phrase.

The problem is that the notion of "phrase search" doesn't really apply to all languages. For instance, in Finnish you would typically have a single word where you would have a preposition + an article + a noun in English.


Then why use a Concordance Search that is optimized for phrase searching rather than SDL MultiTerm?

SDL Support wrote:
Just imagine you have a Translation Memory that contains 100.000 segments (I think this size is normal for a translator nowadays?!) and the word/term you are searching for occurs in 500 segments. Would you really want to wait until the search has gone through your complete TM, only to get 500 results that you then need to go through to find the translation used in a specific context?

But doesn't concordance search rely on an index, like matches at the sentence level? If so, it shouldn't matter whether your TM has 5000, 50.000 or 500.000 units.


Then you would need to wait until it goes through your complete TM to find all possible results and score them and list them. As written in the article: "...Concordance Search is NOT a full-text search through your complete translation memory, rather it is a fuzzy search that has been designed to return meaningful results in large data sets as quickly as possible."


SDL Support wrote:
I agree there are possible enhancements that would give you the option to, for example:
- Search through the complete TM when performing Concordance Search
or
- The possibility to increase the Concordance Search hits

Yes, if the user is prepared to pay the price (in the form of longer search times), he could opt-in for these settings.


That's why I sent the enhancement requests to development :)


Maybe an alternative:
If you just want to look up the different translations that exist for specific terms, would it not make more sense here to create a termbase for those terms, where you can add a variety of synonyms and automatically get possible translations from TermRecognition, which you can see directly without needing to use concordance search?

Of course, having everything in a termbase would solve the problem, but in practice, concordance search is often a substitute for finding terms that are missing from a termbase.


Why use a "substitute" when you can use the "real deal", SDL MultiTerm, which is shipped with SDL Trados Studio?

This can save you more time in the long run, and it takes only around 10 seconds to add a term to your termbase and use it with TermRecognition.

Once the term is in the termbase, Studio will recognize it in the source segment, and you will see the results in the TermRecognition dialog box and can start typing it straight away. Plus, if you have AutoSuggest for Termbases activated, you type maybe 1, 2 or 3 letters, choose one of the listed suggestions and press Enter.


 

Grzegorz Gryc  Identity Verified
Local time: 21:50
French to Polish
Vicious circle... Feb 23, 2013

SDL Support wrote:

(...)

Why use a "substitute" when you can use the "real deal", SDL MultiTerm, which is shipped with SDL Trados Studio?


Because in real life most people use concordance when they don't receive hits from termbases...

I.e. if they don't know what they should put in the termbases, they can't use 'em :)

Cheers
GG


 

FarkasAndras
Local time: 21:50
English to Hungarian
MT vs concordance Feb 23, 2013

SDL Support wrote:

Then why use a Concordance Search that is optimized for phrase searching rather than SDL MultiTerm?



I honestly can't process that. The proposition is absurd. Multiterm cannot conceivably replace concordance search.
Imagine a 10,000 word translation you're doing for a regular client. You're at 8000 words and a problem term comes up. You have a vague recollection that it may have occurred before. Maybe in this job, maybe a previous one you did for the same client earlier. What do you do? You obviously run a concordance search on it, see if it occurred and review the translation(s) you used in the past.
Multiterm is useless unless the term has been added to the termbase. And surely, you don't think that translators add to the TB every single term that occurs in the document? If I did that, I would spend as much time adding terms to the TB as I do translating, for no discernible benefit.

BTW concordance search needs to become a bit more like full text search: it should provide an option for exact term search. I often have to switch to xbench to look things up because the concordance floods me with rubbish hits. I know the exact phrase that occurs in the TM, but I can't get it to show up with a concordance search - I just get a bunch of hits with vaguely similar words in them, often occurring at opposite ends of a 100-word segment. At the very least Trados should have the brains to uprank segments in which the query terms occur close to each other.
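
The upranking idea can be sketched in a few lines of Python. This is only an illustration under simple assumptions (whitespace tokenization, exact word matches, hypothetical function names `min_span` and `rank_hits`); it says nothing about how Trados actually scores hits.

```python
def min_span(tokens, query_terms):
    """Size (in tokens) of the smallest window containing every query term, or None."""
    needed = set(query_terms)
    last_seen = {}  # most recent position of each query term
    best = None
    for pos, tok in enumerate(tokens):
        if tok in needed:
            last_seen[tok] = pos
            if len(last_seen) == len(needed):
                span = pos - min(last_seen.values()) + 1
                if best is None or span < best:
                    best = span
    return best

def rank_hits(segments, query):
    """Keep segments containing all query terms; closest-together terms come first."""
    terms = query.lower().split()
    hits = []
    for seg in segments:
        span = min_span(seg.lower().split(), terms)
        if span is not None:
            hits.append((span, seg))
    return sorted(hits, key=lambda hit: hit[0])

segments = [
    "the pump is switched off before the operator opens the housing",
    "the pump housing must be cleaned",
    "clean the filter",
]
for span, seg in rank_hits(segments, "pump housing"):
    print(span, seg)
```

A segment where "pump" and "housing" are adjacent gets a span of 2 and is listed before one where they sit ten tokens apart; segments missing a query term are dropped.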

[Edited at 2013-02-23 23:06 GMT]


 

Dominique Pivard  Identity Verified
Local time: 22:50
Finnish to French
TOPIC STARTER
MultiTerm not a substitute for concordance Feb 24, 2013

SDL Support wrote:
Have you enabled the Character-based concordance?

Yes, I've tried both (thanks for explaining the differences in the KB article), but it made no difference. Enabling character-based concordance only resulted in bigger TM size, as expected. I could find the word I was looking for by doing a full-text search in the TM view, but it was much slower (again, as expected).
SDL Support wrote:
Then why use a Concordance Search that is optimized for phrase searching rather than SDL MultiTerm?

As Grzegorz and Andras said, a termbase / MultiTerm is not a substitute for concordance search. A typical case is when a client submits both a text to be translated and a TM that goes with it (with translations done by someone else), but no termbase. Ideally, a termbase would also be supplied, but in the real world, more often than not, it isn't. Translators usually get paid by word translated, not by term pair added to the termbase. If they have no assurance this won't be a one-off job, they have little incentive to start building a termbase from the already translated material: instead, they will use concordance search as a "quick & dirty" / ad hoc termbase.
SDL Support wrote:
Then you would need to wait until it goes through your complete TM to find all possible results and score them and list them. As written in the article: "...Concordance Search is NOT a full-text search through your complete translation memory, rather it is a fuzzy search that has been designed to return meaningful results in large data sets as quickly as possible."

Ok, now I (think I) got it: it's not finding hits that takes time, it's scoring them. To be frank, having hits scored by the software is not of much interest to me: if I only have a page-full of hits, my human eyes will do the scoring in a matter of seconds. This is especially easy with the way Trados / Studio present concordance search results (compared to many other tools). It's a bit like with Google: if I get too many hits, I will refine my search until I get less of them, after which there's no need for scoring. A solution to your performance problem would be to allow users to skip the scoring step, giving them raw, unsorted hits. Some tools sort hits by date (newer hits first): there should be next to no performance penalty doing so. Newer hits are often more meaningful than older ones, so it's not a bad way to "score" them.
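
As a rough sketch of what "raw hits, newest first" could look like: the Python below filters by plain substring and sorts by change date, with no fuzzy scoring at all. The record layout (`source`, `target`, `changed`) and the function name `unscored_hits` are made up for the example and are not Studio's data model.

```python
from datetime import date

def unscored_hits(units, query, limit=10):
    """Return up to `limit` units whose source contains the query, newest first."""
    q = query.lower()
    hits = [u for u in units if q in u["source"].lower()]
    # No similarity scoring: the change date is the only ordering criterion.
    hits.sort(key=lambda u: u["changed"], reverse=True)
    return hits[:limit]

# Made-up translation units (Finnish to French).
units = [
    {"source": "Talossa on kaksi ovea.", "target": "La maison a deux portes.",
     "changed": date(2012, 5, 3)},
    {"source": "Talo on punainen.", "target": "La maison est rouge.",
     "changed": date(2013, 1, 15)},
    {"source": "Ovi on auki.", "target": "La porte est ouverte.",
     "changed": date(2011, 9, 20)},
]

for unit in unscored_hits(units, "talo"):
    print(unit["changed"], unit["source"], "->", unit["target"])
```

Because this is a substring match, the query "talo" also finds the inflected form "Talossa". The price is a scan of every unit, which is the slower full-text behaviour mentioned above.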
SDL Support wrote:
That's why I sent the enhancement requests to development :)

OK! Generally speaking, I think it would make sense to interview translators from the real world and ask them how they actually use concordance search and what they expect from it.
SDL Support wrote:
Why use a "substitute" when you can use the "real deal", SDL MultiTerm, which is shipped with SDL Trados Studio?
This can save you more time in the long run, and it takes only around 10 seconds to add a term to your termbase and use it with TermRecognition.

Once the term is in the termbase, Studio will recognize it in the source segment, and you will see the results in the TermRecognition dialog box and can start typing it straight away. Plus, if you have AutoSuggest for Termbases activated, you type maybe 1, 2 or 3 letters, choose one of the listed suggestions and press Enter.

Grzegorz, Andras and I have already answered that: time for a reality check, welcome to the real world of freelance translators!


 

trhanslator (X)
Saving every term to the glossary Feb 24, 2013

FarkasAndras wrote:

Multiterm is useless unless the term has been added to the termbase. And surely, you don't think that translators add to the TB every single term that occurs in the document? If I did that, I would spend as much time adding terms to the TB as I do translating, for no discernible benefit.


I save every term to the glossary. For this I've selected a CAT tool that makes term adding on the fly extremely simple: it handles trailing and leading punctuation marks and numbers.

The time I invest in adding one or more terms in a segment pays back in the next segments.


 

trhanslator (X)
I'm getting paid for adding terms Feb 24, 2013

Dominique Pivard wrote:
Translators usually get paid by word translated, not by term pair added to the termbase. If they have no assurance this won't be a one-off job, they have little incentive to start building a termbase from the already translated material: instead, they will use concordance search as a "quick & dirty" / ad hoc termbase.


I add all terms and get paid by increased productivity in the rest of the translation. I guess, it all depends on how easy it is to quickly add terms on the fly.


 

George Hopkins
Local time: 21:50
Swedish to English
Clients' TMs... Feb 24, 2013

...should be taken with a pinch of salt. They are not always reliable.

Basic hint
When using Concordance, search for the basic form of a word. E.g., bucket and not buckets -- and you will get examples of both bucket and buckets included in the memory.


 

Samuel Murray  Identity Verified
Netherlands
Local time: 21:50
Member (2006)
English to Afrikaans
What does term display look like in Studio? Feb 24, 2013

trhanslator wrote:
FarkasAndras wrote:
And surely, you don't think that translators add to the TB every single term that occurs in the document?

I save every term to the glossary.


Well, the design of my CAT tool means that if I add every term to the glossary(ies), every active segment will light up like a Christmas tree owing to all the term matches. But this makes me wonder (if perhaps Dominique or Paul (?) would be willing to post a screenshot) what a segment with matched terms looks like in Trados 2007 or in Studio.

I have another question (also off-topic), namely: are a glossary and a term base the same thing? I mean, I always hear Trados users talk about their term base, and so I wonder if this is some special thing that is different from an ordinary glossary. Or are they the same thing?


 