Trados Studio Abbreviation Lists and Segmentation
Thread poster: Tom Fennell

Tom Fennell
United States
Local time: 18:54
Member (2010)
Russian to English
+ ...
Dec 6, 2009

My Russian abbreviation lists are not working in Trados Studio.

The activated TMs all have the Russian abbreviation lists correctly displayed in the language resources/abbreviation list field.

Russian Language Resource Template also has the list.

The problem seems to be in the Segmentation Rules Dialog box.

I added a rule called "Abbreviation."

In the Edit Segmentation Rules dialog box, I check the box "check abbreviations."

I then hit OK, and OK again on the Segmentation Rules dialog box.

When I go back by hitting edit on the segmentation rules, then edit on "abbreviation," it turns out that:

the "check abbreviations" box is not checked.

Help!


[Edited at 2009-12-07 03:14 GMT]



Tom Fennell
United States
Local time: 18:54
Member (2010)
Russian to English
+ ...
TOPIC STARTER
Update: SDL called me today Dec 10, 2009

SDL called and hooked up to my computer.

They don't understand it; they think it may have something to do with Vista / User Account Control issues (even though I have UAC disabled).

They are consulting their developers and promised to get back to me tomorrow.



Richard Hall  Identity Verified
United States
Local time: 19:54
Italian to English
+ ...
please keep us informed Dec 11, 2009

I'm following this topic with interest. I hope you keep us informed.


Tom Fennell
United States
Local time: 18:54
Member (2010)
Russian to English
+ ...
TOPIC STARTER
Abbreviations recognition is very important Dec 11, 2009

Richard Hall wrote:

I'm following this topic with interest. I hope you keep us informed.


I had a file that had to have a third of the segments merged due to one unrecognized abbreviation.

This caused the file/tags to become corrupted and I lost many hours restoring it. (Pre-translation of the source was also hindered by the segmentation problem.)

I'll keep the community informed!



Tom Fennell
United States
Local time: 18:54
Member (2010)
Russian to English
+ ...
TOPIC STARTER
OK. Now SDL is concerned. Dec 11, 2009

After the "Vista" approach, today SDL said it might be because of legacy memories.

However, after exploring further we pinpointed the problem. The "check abbreviations" checkbox in the "Edit Segmentation Rules" dialog box works just fine when English is the source language, but does NOT work when Russian (or Ukrainian, Japanese, etc.) is the source. Western European languages seem fine.

Maybe it's an ANSI/Unicode issue?

They were able to recreate the problem on their own machines... now they agree their developers need to get cracking.



Richard Hall  Identity Verified
United States
Local time: 19:54
Italian to English
+ ...
Wow: what an oversight.... Dec 11, 2009

...I'm lost for words...


Tom Fennell
United States
Local time: 18:54
Member (2010)
Russian to English
+ ...
TOPIC STARTER
I'm shocked at our colleagues, too Dec 11, 2009

Richard Hall wrote:

...I'm lost for words...


I'm rather shocked no one seems to have brought it up until now.....



Tom Fennell
United States
Local time: 18:54
Member (2010)
Russian to English
+ ...
TOPIC STARTER
Still no word from SDL Dec 17, 2009

Although this would seem to be a major failure affecting proper segmentation in all languages that do not use the Latin alphabet.

I have to invest at least 30 minutes a day on properly re-segmenting my files. Multiply that by the number of Trados users in non-Latin alphabets.....

And all this work needs to be re-done if there is a file corruption resulting in the need to restore a file from memory.

I'm rather astounded that they are not in "urgent mode."



SDL Community  Identity Verified
United Kingdom
Local time: 01:54
English
Explanation Dec 22, 2009

Hi,

Tom has kindly been helping us to resolve this problem, and we believe we have fixed it. An explanation of the problem is as follows.

Most languages get their segmentation rules through the language group they are in, so Studio defines only a couple of generic (non-language-specific) segmentation rule sets. The generic rules are just called SegmentationRules.xml, whilst the others have a code added to the start, for example CentralEurope_SegmentationRules.xml or WesternEurope_SegmentationRules.xml. Some rule sets apply to a whole set of languages, while others are language-specific and are used only for that one language, such as JA_SegmentationRules.xml.
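
To make the file naming concrete, here is a rough Python sketch of how a rules file might be picked for a language. It is not Studio's actual code. Only the file names come from the explanation above; the lookup order, the function name rules_file_for, and which languages sit in which group are assumptions made for illustration.

```python
# Hypothetical lookup tables. Only the file names are taken from the post;
# the group membership below is assumed for illustration.
LANGUAGE_SPECIFIC = {"ja": "JA_SegmentationRules.xml"}
GROUP_FILES = {
    "CentralEurope": "CentralEurope_SegmentationRules.xml",
    "WesternEurope": "WesternEurope_SegmentationRules.xml",
}
LANGUAGE_GROUP = {"de": "WesternEurope", "pl": "CentralEurope"}  # assumed
GENERIC_FILE = "SegmentationRules.xml"


def rules_file_for(lang_code: str) -> str:
    """Return the rules file name for a code such as 'ru-RU' (assumed order)."""
    base = lang_code.split("-")[0].lower()
    # 1. A language-specific file wins if one exists.
    if base in LANGUAGE_SPECIFIC:
        return LANGUAGE_SPECIFIC[base]
    # 2. Otherwise use the file of the language group, if the language has one.
    group = LANGUAGE_GROUP.get(base)
    if group is not None:
        return GROUP_FILES[group]
    # 3. Everything else falls back to the generic rule set.
    return GENERIC_FILE


print(rules_file_for("ja-JP"))
print(rules_file_for("ru-RU"))
```

Under these assumptions Russian falls through to the generic file, which is consistent with it being on the affected list.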

The problem is not that all non-Latin languages failed to segment based on the abbreviation rules. Only the languages covered by the generic SegmentationRules.xml were affected, because this file didn't contain a full stop rule and consequently didn't contain an abbreviation exception either. Whether or not abbreviations were defined for a language using the generic rules didn't matter.
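
As a toy illustration of why the abbreviation list alone is not enough, the Python sketch below breaks after a full stop and consults the list only when the rule is told to check abbreviations. It is a simplified model, not Studio's segmentation engine; the function segment, its check_abbreviations parameter, and the sample sentence are all made up for this example.

```python
import re


def segment(text, abbreviations, check_abbreviations=True):
    """Break after a full stop followed by whitespace, unless the word
    ending at that full stop is a listed abbreviation and the rule checks
    abbreviations."""
    segments, start = [], 0
    for match in re.finditer(r"\.\s+", text):
        candidate = text[start:match.start() + 1]
        words = candidate.split()
        last_word = words[-1] if words else ""
        if check_abbreviations and last_word in abbreviations:
            continue  # exception: do not break after an abbreviation
        segments.append(candidate)
        start = match.end()
    if start < len(text):
        segments.append(text[start:])  # remaining text is the last segment
    return segments


abbreviations = {"ул.", "д."}
text = "Офис находится на ул. Ленина, д. 5. Вход со двора."

# With the exception active, the address stays in one segment (2 segments).
print(segment(text, abbreviations, check_abbreviations=True))
# Without it, the same abbreviation list changes nothing (4 segments).
print(segment(text, abbreviations, check_abbreviations=False))
```

In this model the second call splits after "ул." and "д." even though both are in the list, which is the kind of over-segmentation that then has to be merged by hand.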

The affected languages are these:

Azeri (Cyrillic) - Azerbaijan
Azeri (Latin) - Azerbaijan
Belarusian - Belarus
Bulgarian - Bulgaria
Bengali (India)
Bosnian (Cyrillic, Bosnia and Herzegovina)
Divehi (Maldives)
Greek (Greece)
Persian
Frisian (Netherlands)
Gujarati (India)
Hebrew (Israel)
Hindi (India)
Armenian (Armenia)
Georgian (Georgia)
Kazakh (Kazakhstan)
Kannada (India)
Konkani (India)
Kyrgyz (Kyrgyzstan)
Macedonian (Former Yugoslav Republic of Macedonia)
Malayalam (India)
Mongolian (Cyrillic, Mongolia)
Marathi (India)
Nepali (Nepal)
Punjabi (India)
Pashto (Afghanistan)
Romansh (Switzerland)
Russian (Russia)
Sanskrit (India)
Serbian (Cyrillic, Bosnia and Herzegovina)
Serbian (Cyrillic, Serbia)
Serbian (Latin, Serbia)
Syriac (Syria)
Tamil (India)
Telugu (India)
Turkish (Turkey)
Tatar (Russia)
Ukrainian (Ukraine)
Urdu (Islamic Republic of Pakistan)
Uzbek (Cyrillic, Uzbekistan)
Uzbek (Latin, Uzbekistan)
Vietnamese (Vietnam)


We have completed some initial tests with a fix that uses a new resource file for these segmentation rules, and this seems to have resolved the problem for all new TMs. This means that any users who have this problem will have to create a new TM and then import the contents of their old TM into the new one after replacing the affected resource file.

The fix will be available in the new year, as it needs to be tested properly. If anyone is having problems in the meantime, as Tom was, because of the additional work of merging many segments that have incorrectly broken where abbreviations were used, please let us know through this forum and we will be happy to make the fix available to you as a beta until we release it properly.

Regards

Paul

