Working from FRA/ITA: How to handle terms with apostrophes?
Thread poster: Rox-Edling

Rox-Edling
Germany
Local time: 01:15
Member (2007)
Spanish to German
+ ...
Mar 12

For everyone working from French or Italian (or any other language with frequent apostrophes), CT has an annoying shortcoming: it does not spot the following kinds of terms:

• Multi-word terms after a straight apostrophe are not recognized
• Single-word terms after a curly apostrophe are not recognized
• Multi-word terms after a curly apostrophe are not recognized

The only terms that are recognized are single-word terms after a straight apostrophe when "Prefix matching" is enabled, but even then a good many remain unspotted. And "Prefix matching" has the drawback of producing a lot of noise: a "Terms consistency check" is practically impossible with that setting.

I was told to enter the terms together with the word before the apostrophe, e.g. "l'expérience" instead of "expérience". I do not think this is viable, as you would need to enter every possible combination of letters (in most cases 2 or 3) and apostrophes (2 or 3). The other use case is working with a client glossary, where this kind of "workaround" is no solution (unless you want to rework the whole glossary ...).
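To give an idea of how quickly these combinations add up, here is a minimal Python sketch. It has nothing to do with CT itself; the three elided letters and the two apostrophes are only assumptions chosen for illustration, and `prefixed_variants` is a hypothetical helper name.

```python
from itertools import product

# Assumed, non-exhaustive sets, chosen only for illustration.
ELIDED_PREFIXES = ["l", "d", "n"]
APOSTROPHES = ["'", "\u2019"]  # straight (U+0027) and curly (U+2019)

def prefixed_variants(term, prefixes=ELIDED_PREFIXES, apostrophes=APOSTROPHES):
    """Return every prefix + apostrophe + term combination for one glossary term."""
    return [f"{p}{a}{term}" for p, a in product(prefixes, apostrophes)]

variants = prefixed_variants("expérience")
print(len(variants), "extra entries for a single term:")
for v in variants:
    print(" ", v)
```

With three letters and two apostrophes, every single term already needs six additional entries; each further letter or apostrophe variant multiplies that number again.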

I did not get any useful hint in the CT forum. Perhaps another user here has a practical suggestion for solving this problem.


 

Igor Kmitowski  Identity Verified
Poland
Local time: 01:15
Member (2016)
English to Polish
+ ...
Prefix matching Mar 12

If you wish to increase the fuzziness of your glossary terms, please turn on the Prefix matching option for your glossary. Alternatively, if you wish to maintain the exactness in your glossary, add the exact terms to your glossary as they are.

 

Rox-Edling
Germany
Local time: 01:15
Member (2007)
Spanish to German
+ ...
TOPIC STARTER
... but ... Mar 12

Thanks for your answer, but what is inexact about terms such as
• expérience
• humanité
• automobile
(and there are thousands more)?

And as I wrote above, the problem is that even with fuzzy matching on, not all of them are spotted ...


 

Igor Kmitowski  Identity Verified
Poland
Local time: 01:15
Member (2016)
English to Polish
+ ...
Regular expression pipe shortcut Mar 12

You might try catching the roots of the terms with the regular expression pipe shortcut, which is as simple as:

|expérience
|humanité
|automobile

However, this works with the Prefix matching option turned on for the glossary, so it might not be the best solution for you.

[Edited at 2018-03-12 19:30 GMT]


 

Rox-Edling
Germany
Local time: 01:15
Member (2007)
Spanish to German
+ ...
TOPIC STARTER
... not at all ... Mar 13

Thanks for the hint, but it does not work, not even with "Prefix matching" activated, at least not for the more common curly apostrophes.

 

Rox-Edling
Germany
Local time: 01:15
Member (2007)
Spanish to German
+ ...
TOPIC STARTER
... perhaps a workaround ... Mar 14

Hans Lenting pointed me to a possible solution:

• Add the corresponding characters to the field "Additional space characters (Unicode)" under Preferences > Memory, e.g. "U+006CU+2019" for "l’".
• Now the terms are recognized.

What are the disadvantages of this provisional workaround?
• The feature mentioned above is not documented at all.
• It will most probably have a negative impact on match values (and even on TM content?).
• There are three most probable letters; with the most probable apostrophes (straight and curly), that makes at least six more entries in a relatively small field. Covering every possible combination of letters (including the less common ones, e.g. "m’", "t’") and apostrophes seems nearly impossible.
• I am unsure whether this workaround really catches all the terms concerned.
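For anyone who wants to try this anyway, the code point notation used in the example above ("U+006CU+2019" for "l’") can be produced with a few lines of Python. This is only a sketch: `to_codepoint_notation` is a hypothetical helper, the list of combinations is an assumption, and how the entries have to be separated in the CT field is not shown here because I do not know it.

```python
def to_codepoint_notation(text):
    """Write each character as U+XXXX, concatenated without separators."""
    return "".join(f"U+{ord(ch):04X}" for ch in text)

# Assumed sample of letter + apostrophe combinations (straight and curly).
combinations = ["l'", "l\u2019", "d'", "d\u2019"]

for combo in combinations:
    print(combo, "->", to_codepoint_notation(combo))
```

The curly variant of "l’" comes out as U+006CU+2019, matching the example entry above.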

So it can only be a very provisional workaround. From the postings above I gather there won't be another solution. That's a pity.


 

Hans Lenting  Identity Verified
Netherlands
Member (2006)
German to Dutch
Not usable Mar 14

Rox-Edling wrote:

• Add the corresponding characters to the field "Additional space characters (Unicode)" under Preferences > Memory, e.g. "U+006CU+2019" for "l’".


The suggested approach is not usable. The recognition of the various forms of the apostrophe as a word delimiter should be fixed in the software.

Without changing the current good behaviour and performance (of term recognition), of course.
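To illustrate what "apostrophe as a word delimiter" means in general, here is a minimal Python sketch. It is not how CT works internally (that is not known to me); it only shows the idea of splitting on both the straight and the curly apostrophe before looking up glossary terms. `find_terms` and the sample glossary are hypothetical.

```python
import re

# Whitespace, straight apostrophe (U+0027) and curly apostrophe (U+2019)
# are all treated as word breaks.
DELIMITERS = re.compile(r"[\s'\u2019]+")

def find_terms(segment, glossary):
    """Return the glossary terms found in segment, in glossary order."""
    words = [w.strip(".,;:!?").lower() for w in DELIMITERS.split(segment) if w]
    hits = []
    for term in glossary:
        term_words = term.lower().split()
        n = len(term_words)
        # Slide a window of n words over the segment and compare.
        if any(words[i:i + n] == term_words for i in range(len(words) - n + 1)):
            hits.append(term)
    return hits

glossary = ["expérience", "automobile", "industrie automobile"]
print(find_terms("L\u2019expérience de l'industrie automobile.", glossary))
```

Because the elided article is split off as its own word, single-word and multi-word terms are found after either kind of apostrophe without prefix matching and without adding "l'..." variants to the glossary.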


 

Rox-Edling
Germany
Local time: 01:15
Member (2007)
Spanish to German
+ ...
TOPIC STARTER
Fixed! Mar 15

The issue has been fixed with the newest update. Thanks!

 

Hans Lenting  Identity Verified
Netherlands
Member (2006)
German to Dutch
Congratulations Mar 16

Rox-Edling wrote:

The issue has been fixed with the newest update. Thanks!


I know for a fact that you got grey hair over this issue. Glad that it has been resolved!

Thank you for persistently asking and for providing background information. Much appreciated!

[Edited at 2018-03-16 07:26 GMT]


 

