TU not found in Studio 2011 | Thread poster: Jonathan Hopkins
Has anyone ever experienced the case of Studio not producing a fuzzy match for an existing TU? Here's an example of what I mean. I have a segment with the following text:

"Office Suites Antibakterielle Microban-Fußstütze"

In my TM I have a nearly perfect match:

"Office Suites Antibakterielle Microban Fußstütze"

As you can see, the only difference is the hyphen between Microban and Fußstütze. However, Studio can't find it in the Editor (Ctrl+Shift+T). Only ...

Erik Freitag | Germany | Local time: 19:16 | Member (2006) | Dutch to German + ...
Known problem | Nov 23, 2011
Jonathan,

This is a behaviour I know quite well. The outcome of quite some detective work with support staff was that in my case this has to do with the way project TMs are populated from the master TM. To my surprise, staff informed me that the use of project TMs should generally be avoided, unless absolutely needed for collaboration with other translators - this is a piece of information I'd like to see in the software documentation, but as there actually isn't even anything like that ...

The problem indeed seems to be gone since I don't use project TMs anymore. So - do you use project TMs? If yes - try without.

Kind regards, Erik
[Edited at 2011-11-23 17:07 GMT]

Jonathan Hopkins | Germany | Local time: 19:16 | German to English + ... | TOPIC STARTER
I wouldn't want to work without the project TM | Nov 23, 2011
Hi Erik,

Thanks for your reply.

efreitag wrote:
So - do you use project TMs? If yes - try without.

Since Studio doesn't have an automatic backup or save function for the sdlxliff files, I'd rather not do without a project TM, since that is the only real backup that I have. At least to my knowledge, master TMs are not automatically updated upon confirming a segment (Ctrl+Enter); only project TMs are. (Or is this merely a setting that I could change, so that the master TMs are always automatically updated after confirming a segment?)

Don't you worry that you may lose work, should Studio crash? How do you secure your work?

Cheers, Jon

Jonathan Hopkins | Germany | Local time: 19:16 | German to English + ... | TOPIC STARTER
No change after disabling the project TM | Nov 23, 2011
Hi Erik,

I disabled the Project TM (and incidentally it asked me if I wanted to update the master TM, so that answered my earlier question), but that didn't give me the desired result, unfortunately. I've since logged a support case with SDL. If they provide me with anything useful I'll pass it on.

Cheers, Jonathan

Erik Freitag | Germany | Local time: 19:16 | Member (2006) | Dutch to German + ...
Match value issue? | Nov 23, 2011
Jonathan,

Ok, the next thing to consider is that the problem you're experiencing might be based on another well-known bug (or, depending on the point of view: feature) of Trados, at least since 2009: what a human perceives as a near-100% match (say, 99%) often is only a 70% match for Studio (I'm making up the numbers, obviously). Studio's match algorithm is quite useless. For us humans, there's only a hyphen missing; for Studio, there are two separate words (Microban Fußstütze) in the TM, while there's only one long word (Microban-Fußstütze) in your text. As far as Studio's match algorithm is concerned, both are completely unrelated.

You might want to play with your fuzzy match value settings a bit: if you set them low enough, your TM segment might be proposed as a fuzzy match - the trade-off being that you'll get a lot of noise then.

Kind regards, Erik

Edit: Links to earlier discussions:
http://glg.proz.com/forum/sdl_trados_support/183991-how_to_have_trados_concentrate_on_relevant_text_rather_than_tag_material_for_finding_matches.html
http://www.proz.com/forum/cat_tools_technical_help/196156-match_algorithm_expectations.html

[Edited at 2011-11-23 17:58 GMT]
[Edited at 2011-11-23 18:00 GMT]

Jonathan Hopkins | Germany | Local time: 19:16 | German to English + ... | TOPIC STARTER
No change even for 30% threshold | Nov 23, 2011
Hi Erik,

I actually already responded to this message several hours ago, but it would appear that something is amiss and the post never appeared. Hence this message.

efreitag wrote:
You might want to play with your fuzzy match value settings a bit: If you set them low enough, your TM segment might be proposed as a fuzzy match ...

This was the first experiment I tried. I set the threshold down to the lowest possible value (30%) and received a hit (53%), which was nothing like the segment in question. Unfortunately, the TU that is exactly the same, save the hyphen, still wasn't found by Studio.

Btw, thanks for the links to the other threads.

Cheers, Jonathan

Some matching algorithms are more "human" than others... | Nov 24, 2011
efreitag wrote:
Ok, next thing to consider is that the problem you're experiencing might be based on another well-known bug (or, depending on the point of view: feature) of Trados, at least since 2009: What a human perceives as a near-100% match (say, 99%), often is only a 70% match for Studio (I'm making up the numbers, obviously). Studio's match algorithm is quite useless.

FWIW, here is how some other tools would have rated Jonathan's fuzzy match:
1) TWB 8.3: 97%
2) Wordfast Classic 6: 87%
3) memoQ 5: 68%

Jonathan Hopkins | Germany | Local time: 19:16 | German to English + ... | TOPIC STARTER
Ok, so now I not only lowered the threshold a few notches, but also checked the box "Search both project and main translation memories", and that produced the appropriate TU.

Even so, I'd expect a sentence that is a perfect match save one character to be much closer to 100%, i.e. 47 matching characters divided by a total of 48 characters = ca. 98% match.

Cheers, Jonathan
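
For reference, the character-based figure mentioned above can be reproduced with a plain edit-distance calculation. The following is a minimal Python sketch under stated assumptions, not Studio's algorithm: the function names `levenshtein` and `char_similarity` are made up for illustration, and the score is simply 1 minus the edit distance divided by the length of the longer segment.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def char_similarity(a: str, b: str) -> float:
    """Hypothetical score: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


# The two segments from the first post; they differ only in hyphen vs. space.
new_segment = "Office Suites Antibakterielle Microban-Fußstütze"
tm_segment = "Office Suites Antibakterielle Microban Fußstütze"

print(f"{char_similarity(new_segment, tm_segment):.1%}")  # 47/48, i.e. 97.9%
```

A purely character-based measure like this one treats the hyphen as a single substituted character, which is why it lands so close to 100%.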

Anne Bohy | France | Local time: 19:16 | English to French
May be related to particular characters in the sentence | Dec 1, 2011
I have just experienced a similar problem. I was trying to retrieve 100% matches from a TM (English to French) and discovered that some were found, some not. I realized quite quickly that the sentences which didn't work were those containing contractions (aren't, isn't, etc.), that is, all sentences containing a single quote.

Changing the project settings as you indicated, to search both project and main translation memories, helped. HOWEVER, I still see that IDENTICAL TUs are not considered 100% matches when there are quotes in them... For instance the word "they're" is struck through and replaced by "they'" (in front) and "re" (behind)... Because of this, the match rate is down to 92%. Obviously, there are two pieces of Studio 2011 code which do not handle words the same way!

In my opinion the hyphen that you have in your TUs may lead to the same problem. I wonder if some piece of code suppresses hyphens and concatenates the strings before and after the hyphen? Try to see if you can find the concatenated word (with no hyphen) in your TM.

Jonathan Hopkins | Germany | Local time: 19:16 | German to English + ... | TOPIC STARTER
I wouldn't consider those exactly 100% matches either | Dec 1, 2011
Hi Bohy,

bohy wrote:
I have just experienced a similar problem. I was trying to retrieve 100% matches from a TM (English to French) and discovered that some were found, some not. I realized quite quickly that the sentences which didn't work were those containing contractions (aren't, isn't, etc.), that is, all sentences containing a single quote.

So, were the 92% matches not found at all? Or do you just mean that instead of a 100% match, Studio gave you 92% matches?

bohy wrote:
HOWEVER, I still see that IDENTICAL TUs are not considered 100% matches when there are quotes in them... For instance the word "they're" is struck through and replaced by "they'" (in front) and "re" (behind)... Because of this, the match rate is down to 92%.

I wouldn't have expected Studio to consider these as 100% matches either, and depending on how many characters are in the source segment (e.g. if you only have 12 characters), a difference of only one character could justifiably decrease the value to roughly 92%. However, I think Studio's algorithms work on a word-matching basis, and therefore if just one character is different, Studio considers the entire word a mismatch (even though the difference is only an apostrophe or hyphen). That would be the reason why the fuzzy value seems way off, especially for short segments.

bohy wrote:
I wonder if some piece of code suppresses hyphens and concatenates the strings before and after the hyphen? Try to see if you can find the concatenated word (with no hyphen) in your TM.

Please see above. I've already included screenshots showing the word in a concordance search and via the TM view.

Cheers, Jonathan

Jonathan Hopkins | Germany | Local time: 19:16 | German to English + ... | TOPIC STARTER
So, if you have a short sentence and the old source segment differs from the new source segment by nothing more than a hyphen, you may not get a match at all, but if your sentence only has a few words, let's say 4, and two of them happen to match at the same position, you may get an incredibly high match.

So "Zipper compartment in lid" (old segment) is a 73% match of "On-screen display (OSD) menu", and for the simple fact that the preposition and definite article match at the same location: "... auf dem..." (on the).

I think this example is good at showing the fallacy of weighting the position of words so highly. Especially for small sentences, it is entirely possible that the language structure and word order will often be similar and prepositions and articles will be in similar places, as in the example above, so why weight it so highly?

On the other hand, if I have a segment like:

Gel Handgelenkauflage

and then

Gel-Handgelenkauflage

or

Gelhandgelenkauflage

neither of the latter two spellings will match the first segment at all: 0% match.

But if you happen to be translating some kind of point-form list in a ppt file, you'll get all kinds of nonsense just because there are a similar number of words in the segment, with the position of a colon and one two-letter word being in the same place: *sigh*

That's why I changed back to Trados 2007 | Dec 1, 2011
I experienced the same behaviour as you, Jonathan, and since - because of its new matching algorithm - Trados Studio gives me too little leverage of my TMs, I changed back to Trados 2007. To me, investing in Studio 2009 was useless. I posted examples similar to yours elsewhere in this forum, for instance:

New segment: Condensate pump (A and B)
TM hit 1: Reference plot (A and B) [74%]
TM hit 3: Condensate pump A [62%]
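
The rankings in the examples above are what a purely word-based comparison tends to produce. As a rough illustration only (an assumption about the general approach, not SDL's actual algorithm, and it does not reproduce the percentages Studio reported), the Python sketch below scores two segments with a Dice coefficient over whitespace-separated tokens; `word_dice` is a hypothetical helper name.

```python
from collections import Counter


def word_dice(a: str, b: str) -> float:
    """Dice coefficient over whitespace-separated tokens (illustrative stand-in)."""
    tokens_a, tokens_b = Counter(a.split()), Counter(b.split())
    total = sum(tokens_a.values()) + sum(tokens_b.values())
    if total == 0:
        return 1.0
    # Counter & Counter keeps the minimum count of each shared token.
    shared = sum((tokens_a & tokens_b).values())
    return 2 * shared / total


new_segment = "Condensate pump (A and B)"

# Shares "(A", "and", "B)" with the new segment: 2 * 3 / (5 + 5)
print(word_dice(new_segment, "Reference plot (A and B)"))  # 0.6

# Shares only "Condensate" and "pump" ("A" is not the same token as "(A"): 2 * 2 / (5 + 3)
print(word_dice(new_segment, "Condensate pump A"))  # 0.5

# The hyphenated compound is one token, so nothing is shared at all.
print(word_dice("Gel Handgelenkauflage", "Gel-Handgelenkauflage"))  # 0.0
```

Even this crude measure ranks the unrelated "Reference plot" segment above the more useful "Condensate pump A", and gives the hyphenated compound no credit, which matches the pattern described in this thread.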

Anne Bohy | France | Local time: 19:16 | English to French
Eureka: different characters | Dec 1, 2011
I tried to reproduce the problem with a simple test case. After many unsuccessful attempts, I finally realized what happened. The text in my translation memory was actually coming from an Excel file. The new text that I wanted to translate was a Word file. The problem is that Word and Excel don't use the same character for the apostrophe: when you hit the single quote key, Excel produces a (vertical) single quote and Word produces an apostrophe (slanted or curly, depending on the language context).

The same may happen with your dash. Have you checked that it is the same dash? There are short ones, and longer ones... The dash is considered as a character inside a word, so changing the type of dash makes the whole compound word appear as different...

Although there is an explanation for this strange behavior, it is something that we would like SDL to address in a rational way!

Jonathan Hopkins | Germany | Local time: 19:16 | German to English + ... | TOPIC STARTER
Thanks for those examples | Dec 1, 2011
Hello Matthias,

Dr. Matthias Schauen wrote:
I experienced the same behaviour as you, Jonathan, and since - because of its new matching algorithm - Trados Studio gives me too little leverage of my TMs, I changed back to Trados 2007. To me, investing in Studio 2009 was useless. I posted examples similar to yours elsewhere in this forum, for instance:
New segment: Condensate pump (A and B)
TM hit 1: Reference plot (A and B) [74%]
TM hit 3: Condensate pump A [62%]

Thanks for those examples. I've never used older versions of Trados. My experience with CAT tools dates back only roughly 4 years and is limited to a few trial versions of Déjà Vu, Wordfast Classic, memoQ, Swordfish and some others. The only proprietary tools that I've used (and used the most extensively) are Transit and now Studio.

I like the environment of Studio and really appreciate a lot of its features (AutoSuggest, incl. the method for inserting dictionary entries, to name but one or two advantages). I just wish it would fix some of these really annoying issues. Surely, the developers at SDL must admit that something is amiss here. Has Paul or anyone else from SDL given a response to complaints of this kind in other threads?

Kind regards, Jonathan

Responses? Yes, but... | Dec 1, 2011
Jonathan Hopkins wrote:
Has Paul or anyone else from SDL given a response to complaints of this kind in other threads?

What I found here from people working for or associated with SDL in response to reports of this problem all goes in the same direction:

"We have a single algorithm which is optimized to deliver appropriate scores in most situations and this is more likely to be the reason why most users don't complain about it."

"...a need to understand specific cases and how best to use the software to suit your needs as you can only cater for the majority of situations with the default settings."

Wrong or right - the whole world is imperfect