Pages in topic:   [1 2] >
The Lord of the Rings
Thread poster: Grzegorz Gryc

Grzegorz Gryc  Identity Verified
Local time: 00:47
French to Polish
+ ...
May 17, 2012

Hi

Can anybody explain to me the SDL logic by which a sentence like "The Lord of the Ring" is proposed as a 73% match for "The Name of the Rose" in Trados Studio 2011?

Cheers
GG



Tony.J.A.@DT  Identity Verified
United States
Local time: 18:47
English to French
+ ...
funny but not that illogical May 17, 2012

Grzegorz Gryc wrote:

Hi

Can anybody explain to me the SDL logic by which a sentence like "The Lord of the Ring" is proposed as a 73% match for "The Name of the Rose" in Trados Studio 2011?

Cheers
GG


When you look at [The ... of the ...] you do get ~73% lol



Grzegorz Gryc  Identity Verified
Local time: 00:47
French to Polish
+ ...
TOPIC STARTER
The things are getting worse... May 17, 2012

Tony.J.A.@DT wrote:

Grzegorz Gryc wrote:

Can anybody explain to me the SDL logic by which a sentence like "The Lord of the Ring" is proposed as a 73% match for "The Name of the Rose" in Trados Studio 2011?


When you look at [The ... of the ...] you do get ~73% lol


Nope...
Mathematically speaking, if every word in a 5-word sentence stands for 20%, it should be 60%, as in the "classic" Trados.
Really, it's some very fuzzy logic here...
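
The arithmetic above can be sketched in a few lines. This is only a minimal illustration of a "flat" position-by-position word comparison, assuming every word carries the same weight; the function name flat_match is made up for this example and this is not the actual algorithm of "classic" Trados or of Studio.

```python
def flat_match(source: str, candidate: str) -> float:
    """Share of word positions holding the same word in both segments.

    Assumption: every word has the same weight (20% each in a 5-word segment).
    """
    a = source.lower().split()
    b = candidate.lower().split()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty segments are treated as identical
    same = sum(1 for x, y in zip(a, b) if x == y)
    return same / longest


score = flat_match("The Lord of the Rings", "The Name of the Rose")
print(f"{score:.0%}")  # 60%
```

Three of the five positions ("The", "of", "the") agree, hence 3 x 20% = 60%.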

Cheers
GG



neilmac  Identity Verified
Spain
Local time: 00:47
Spanish to English
+ ...
Plurality May 18, 2012

Isn't it "Lord of the Rings"?

And anyway, the day I need a machine to translate that kind of thing, I'll hang up my translator hat.



Erik Freitag  Identity Verified
Germany
Local time: 00:47
Member (2006)
Dutch to German
+ ...
Beside the point May 18, 2012

neilmac wrote:
And anyway, the day I need a machine to translate that kind of thing, I'll hang up my translator hat.


I think this is beside the point. I'm sure Grzegorz has made up these words as a simple, but revealing example of how utterly useless Studio's matching algorithm frequently is.



[Edited at 2012-05-18 11:55 GMT]

[Edited at 2012-05-18 12:16 GMT]



SDL Community  Identity Verified
United Kingdom
Local time: 00:47
English
position May 18, 2012

It doesn't just take into account the words, but also the position of the words in the segments. As the order of "The ... of the ..." is 100% the same in both sentences, it increases the match rate.

And quite honestly:

"The Lord of the Rings"
translates to:
"Der Herr der Ringe"
And:
"The Name of the Rose"
translates to:
"Der Name der Rose"

Which is quite easy to translate and does not require the translator to change the structure of the sentence, just two words (obviously it might be different for other languages, where you need to change the structure, but still).

That would easily warrant a 73% in my book, as it's not that difficult a sentence to translate.

Other than that, the longer the segments are, the more "correct" the match rates become. For small sentences, it's very difficult to find a good match rate, unless you start "humanising" the engine (which we have MT for...)

Cheers,
Luis



Grzegorz Gryc  Identity Verified
Local time: 00:47
French to Polish
+ ...
TOPIC STARTER
Dig deeper... May 18, 2012

SDL Support wrote:

It doesn't just take into account the words, but also the position of the words in the segments. As the order of "The ... of the ..." is 100% the same in both sentences, it increases the match rate.

And quite honestly:

"The Lord of the Rings"
translates to:
"Der Herr der Ringe"

Here (PL), "Władca pierścieni".

And:
"The Name of the Rose"
translates to:
"Der Name der Rose"

Here (PL), "Imię róży".

No common point at all.
Select "smarter" language pairs.

Which is quite easy to translate and does not require the translator to change the structure of the sentence, just two words (obviously it might be different for other languages, where you need to change the structure, but still).

As above.
Structure-related info is almost useless in my language.
Start thinking with some trickier assumptions.

That would easily warrant a 73% in my book, as it's not that difficult a sentence to translate.

A good algorithm should be at least partially reversible.
I.e., here, as a human, I expect something like 20-25% in the EN-PL pair i.e. below the Trados minimum threshold (30%).
Because for PL-EN, the example above is obviously 0% according to Trados.

Other than that, the longer the segments are, the more "correct" the match rates become.

I liked your "more correct" a lot.
I.e. less aberrant, as I understand it?

For small sentences, it's very difficult to find a good match rate, unless you start "humanising" the engine (which we have MT for...)

Nope.
Just use the basic assumptions of information theory, i.e. start applying sound weights to frequent/shorter words.
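
One way to picture such weighting, as a rough sketch only: give very frequent function words a small weight and content words a full weight, then compare word by word. The stop-word list, the two weight values and the function names below are illustrative assumptions, not values taken from Trados or any other CAT tool; a fuller version might derive the weights from corpus frequencies instead of a hand-made list.

```python
# Illustrative values only: not taken from Trados or any other CAT tool.
STOP_WORDS = {"the", "of", "in", "a", "an", "and"}
STOP_WEIGHT = 0.2      # assumed weight of a frequent/short function word
CONTENT_WEIGHT = 1.0   # assumed weight of every other word


def weight(word: str) -> float:
    """Low weight for frequent function words, full weight otherwise."""
    return STOP_WEIGHT if word in STOP_WORDS else CONTENT_WEIGHT


def weighted_match(source: str, candidate: str) -> float:
    """Weighted share of word positions that agree in both segments."""
    a = source.lower().split()
    b = candidate.lower().split()
    total = max(sum(map(weight, a)), sum(map(weight, b)))
    if total == 0:
        return 1.0  # two empty segments are treated as identical
    matched = sum(weight(x) for x, y in zip(a, b) if x == y)
    return matched / total


score = weighted_match("The Lord of the Rings", "The Name of the Rose")
print(f"{score:.0%}")  # 23%
```

With these assumed weights, the shared "The ... of the ..." skeleton is worth 0.6 out of 2.6, so the pair falls below a 30% minimum threshold instead of reaching 73%.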

Cheers
GG



Grzegorz Gryc  Identity Verified
Local time: 00:47
French to Polish
+ ...
TOPIC STARTER
Typo... May 18, 2012

efreitag wrote:

neilmac wrote:
And anyway, the day I need a machine to translate that kind of thing, I'll hang up my translator hat.


I think this is beside the point. I'm sure Grzegorz has made up these words as a simple, but revealing example of how utterly useless Studio's matching algorithm frequently is.


Exactly.

And sorry for the flagrant typo in the message text.
Of course, "The Lord of the Rings".
At least, I wrote it correctly in the title...

PS
I have more funny examples like that...

Cheers
GG


Daniel García
English to Spanish
+ ...
Would it not be too much work to implement such a solution? May 18, 2012

Grzegorz Gryc wrote:

For small sentences, it's very difficult to find a good match rate, unless you start "humanising" the engine (which we have MT for...)

Nope.
Just use the basic assumptions of information theory, i.e. start applying sound weights to frequent/shorter words.

Cheers
GG


ExtraTerm used to have a list of "stop words" for different languages. I guess these lists might be used for this kind of weighting that you are suggesting.

I am not sure, though, what would be the cost of implementing and maintaining a language-dependent solution like this one and whether the cost of implementing it would outweigh the benefits (more accurate fuzzy matching for short phrases).

In the projects I have worked on, I don't recall ever having a large number of short phrases without a verb but with repeated syntactical structures, where this miscalculation of the fuzzy matches would be relevant for the final quote, but your mileage might vary.

I guess that if you are translating a product catalog or some other similar list of components, this issue might be relevant. I wonder how other CAT tools deal with this type of phrase.

I am not sure how many SDL customers would need to have this sorted instead of other issues or features if it is not going to be a problem for them.

Daniel



Pavel Tsvetkov  Identity Verified
Bulgaria
Local time: 01:47
Member (2008)
English to Bulgarian
+ ...

MODERATOR
However... May 19, 2012

These movies are alike in a sense. Both are very popular and among my favorites, so maybe Trados takes into account the cultural context, and more specifically – mine.

Thanks for the entertaining post, Grzegorz.



Grzegorz Gryc  Identity Verified
Local time: 00:47
French to Polish
+ ...
TOPIC STARTER
2 of 5 May 19, 2012

Pavel Tsvetkov wrote:

These movies are alike in a sense. Both are very popular and among my favorites, so maybe Trados takes into account the cultural context, and more specifically – mine.


Trados also takes into account the religious context.
"In the Name of Father" is also a 73% match for "In the Garden of Eden".

I.e. the error is perfectly reproducible in the pattern "2 words out of 5 don't match".
The 73% score is aberrant.

Cheers
GG



Alexandre Maricato
Brazil
Local time: 21:47
Member (2009)
English to Portuguese
Trados 2011 Popcorn May 20, 2012

Maybe Trados is not going to the movies lately.

Tip for the developers: IMDB plugin

[Edited at 2012-05-21 06:41 GMT]



Grzegorz Gryc  Identity Verified
Local time: 00:47
French to Polish
+ ...
TOPIC STARTER
How to pay less for translation... May 20, 2012

Daniel García wrote:

ExtraTerm used to have a list of "stop words" for different languages. I guess these lists might be used for this kind of weighting that you are suggesting.

I am not sure, though, what would be the cost of implementing and maintaining a language-dependent solution like this one and whether the cost of implementing it would outweigh the benefits (more accurate fuzzy matching for short phrases).

Indeed, language-specific solutions would be effective.
But it can be done even without stopwords and language-dependent solutions.
Trados Studio should simply not use stupid coefficients intended to artificially raise the fuzziness level and pay the translators less.

In the projects I have worked on, I don't recall ever having a large number of short phrases without a verb but with repeated syntactical structures, where this miscalculation of the fuzzy matches would be relevant for the final quote, but your mileage might vary.

I guess that if you are translating a product catalog or some other similar list of components, this issue might be relevant.

It is.
Especially if you consider the absurd weight Trados applies to numerals (and generally placeables) and tags.
E.g. in Studio "300 Spartans" is a 65% match for "3 Musketeers".
If you add some tags, dashes, punctuation, you'll have a match above the default 70% Trados threshold.
It's absurd.
This algorithm is bad.

I wonder how other CAT tools deal with this type of phrase.

Basically, most of them simply use a "flat" algorithm where the weight of words is identical and the weight of the numerals is close to 0 (in fact, the numerals are replaced/inserted automatically by most CAT tools...).
E.g. for DVX, Swordfish, Cafetran, Wordfast Pro etc., "300 Spartans" would never be suggested for "3 Musketeers" even if you apply a very low match threshold.
memoQ has a bug in the algorithm for 2 word sentences, so it would match 'em with a 64% score.
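
The idea of numerals weighing next to nothing can be sketched as follows. This is a simplified illustration, assuming numerals are treated as placeables with a weight of exactly 0 and all other words are weighted equally; the regular expression and function name are made up for the example and do not reproduce the code of DVX, Swordfish, Cafetran or Wordfast Pro.

```python
import re

# Assumption: a token made only of digits (with optional . or , groups) is a numeral.
NUMERAL = re.compile(r"\d+(?:[.,]\d+)*")


def flat_match_ignoring_numerals(source: str, candidate: str) -> float:
    """Flat word comparison in which numerals are dropped before scoring."""
    a = [w for w in source.lower().split() if not NUMERAL.fullmatch(w)]
    b = [w for w in candidate.lower().split() if not NUMERAL.fullmatch(w)]
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # nothing left to compare except placeables
    same = sum(1 for x, y in zip(a, b) if x == y)
    return same / longest


print(flat_match_ignoring_numerals("300 Spartans", "3 Musketeers"))  # 0.0
print(flat_match_ignoring_numerals("300 Spartans", "3 Spartans"))    # 1.0
```

Under these assumptions "300 Spartans" gets no credit at all against "3 Musketeers", while a segment differing only in the number is a full match whose numeral can be replaced automatically.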

If 2 words in a 5-word sentence differ, a tool using this kind of flat algorithm would show a 60% match, which is obviously a poor result for a human with a sentence pair like "The Lord of the Rings" and "The Name of the Rose". But the 60% score can be easily explained and understood: simplistic word counting has its obvious limitations, but it's logical.
The 73% in Trados Studio is simply crazy.

I am not sure how many SDL customers would need to have this sorted instead of other issues or features if it is not going to be a problem for them.

The problem is that the same kind of error also exists in larger sentences, but it's less visible.
E.g., if one word changes in a 10-word sentence, most CAT tools (including the old Trados) will show a 90% fuzzy match.
For Studio 2011, it's a 94% match.
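
The 90% figure follows from a word-level edit distance. Below is a minimal sketch, assuming the score is 1 minus the number of word insertions, deletions and substitutions divided by the length of the longer segment; the function names and the sample sentence are invented for the illustration and no specific CAT tool is claimed to work exactly this way.

```python
def word_edit_distance(a: list[str], b: list[str]) -> int:
    """Levenshtein distance counted in whole words."""
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, start=1):
        cur = [i]
        for j, wb in enumerate(b, start=1):
            cost = 0 if wa == wb else 1
            cur.append(min(prev[j] + 1,          # delete a word
                           cur[j - 1] + 1,       # insert a word
                           prev[j - 1] + cost))  # keep or substitute
        prev = cur
    return prev[-1]


def edit_match(source: str, candidate: str) -> float:
    """1 - (word edits / length of the longer segment)."""
    a = source.lower().split()
    b = candidate.lower().split()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1 - word_edit_distance(a, b) / longest


ten_words = "the quick brown fox jumps over the lazy dog today"
one_changed = "the quick brown fox jumps over the sleepy dog today"
print(f"{edit_match(ten_words, one_changed):.0%}")  # 90%
```

One substituted word out of ten gives 1 - 1/10 = 90%, so a 94% score for the same pair implies some extra coefficient on top of plain word counting.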
Of course, people are happy to receive this kind of match and don't complain, but they don't realize that Studio overestimates matches and underestimates the money the translators earn.

Cheers
GG

[Edited at 2012-05-20 08:10 GMT]


Stefan Pecen  Identity Verified
Local time: 00:47
Member (2006)
English to Slovak
+ ...
It is a logical consequence Jun 20, 2012

of the fact that SDL is not only a software vendor, but also a translation service provider,
so it is in their interest to lower the wordcounts to buy cheaper.



SDL Community  Identity Verified
United Kingdom
Local time: 00:47
English
We also employ... Jun 20, 2012

Stephen wrote:

of the fact that SDL is not only a software vendor, but also a translation service provider,
so it is in their interest to lower the wordcounts to buy cheaper.



... a lot of in-house translators, so how does this apply here? Do we make sure we pay ourselves less too?

Paul

