Add new segmentation rules (SRX)
Thread poster: Jessicaliu
Jessicaliu  Identity Verified
Hong Kong
Local time: 06:56
Chinese to English
Sep 12, 2016

I need to add two new segmentation rules to deal with some repetitive expressions. For example,

For further details please refer to the section "A. Industry Information" in this manual. For further details please refer to the section "1. Company Information" in this manual.

These two sentences are split into four segments, each ending with a dot, but I need the tool to ignore the dots after "A" and "1".

The rules I wrote didn't work. Could you please help me find what is wrong?

rule break="no"

Rule 1
beforebreak: \d\.
afterbreak: \s+\p{Ll}

Rule 2
beforebreak: [\p{Lu}]\.
afterbreak: [\p{Z}]\[\p{Lu}]

Rule 3
beforebreak: ^\s*[0-9]+\.
afterbreak: \s+\p{Ll}

None of these rules works. (Since some XML tags are not permitted here, I have omitted the angle brackets.)
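
One way to see why the posted patterns never fire is to test them against the example sentence outside the CAT tool. The sketch below is only an approximation: it reads the patterns as three beforebreak/afterbreak pairs in the order posted, and it swaps \p{Lu}, \p{Ll} and \p{Z} for ASCII stand-ins because Python's built-in `re` module has no \p{...} classes. The `SUGGESTED_RULES` pair is a hypothetical alternative, not something taken from the thread, and it assumes the section labels always follow a straight double quote.

```python
import re

# ASCII stand-ins (assumption): \p{Lu} -> [A-Z], \p{Ll} -> [a-z], \p{Z} -> \s.
POSTED_RULES = [
    (r"\d\.", r"\s+[a-z]"),          # needs a lowercase letter after the dot
    (r"[A-Z]\.", r"\s\[[A-Z]]"),     # the stray backslash makes "[" a literal
    (r"^\s*[0-9]+\.", r"\s+[a-z]"),  # "^" only matches at the start of the text
]

# Hypothetical replacement: a quoted one-letter or numeric label, then a dot,
# followed by any whitespace.
SUGGESTED_RULES = [(r'"(?:[A-Z]|\d+)\.', r"\s")]

TEXT = ('For further details please refer to the section "A. Industry Information" '
        'in this manual. For further details please refer to the section '
        '"1. Company Information" in this manual.')

def is_exception(text, pos, before, after):
    """True if beforebreak matches text ending at pos and afterbreak matches from pos."""
    return (re.search("(?:" + before + r")\Z", text[:pos]) is not None
            and re.match(after, text[pos:]) is not None)

def protected_dots(text, rules):
    """Return the three characters ending at each dot that some rule protects."""
    hits = []
    for m in re.finditer(r"\.", text):
        pos = m.end()
        if any(is_exception(text, pos, b, a) for b, a in rules):
            hits.append(text[max(0, pos - 3):pos])
    return hits

print(protected_dots(TEXT, POSTED_RULES))     # []
print(protected_dots(TEXT, SUGGESTED_RULES))  # ['"A.', '"1.']
```

In the example, the word after each label starts with an uppercase letter ("Industry", "Company"), so an afterbreak that requires a lowercase letter cannot match. Dropping the quote from the suggested beforebreak would also suppress breaks after ordinary sentences that end in a number or a single capital letter.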


[Edited at 2016-09-12 07:09 GMT]

[Edited at 2016-09-12 07:12 GMT]

[Edited at 2016-09-12 08:44 GMT]



Soonthon LUPKITARO(Ph.D.)  Identity Verified
Thailand
Local time: 05:56
Member (2004)
English to Thai
Similar segments Sep 12, 2016

Jessicaliu wrote:


I need to add two new segmentation rules to deal with some repetitive expressions. For example,

For further details please refer to the section "A. Industry Information" in this manual. For further details please refer to the section "1. Company Information" in this manual.

These two sentences are split into four segments, each ending with a dot, but I need the tool to ignore the dots after "A" and "1".



If I were you, I would leave the segmentation rules as they are, since the dot is essential for ending most segments. The similar segments will still come up as fuzzy matches and in concordance searches, though.

Soonthon L.


Jessicaliu  Identity Verified
Hong Kong
Local time: 06:56
Chinese to English
TOPIC STARTER
thanks Sep 13, 2016

Soonthon LUPKITARO(Ph.D.) wrote:

If I were you, I would leave the segmentation rules as they are, since the dot is essential for ending most segments. The similar segments will still come up as fuzzy matches and in concordance searches, though.

Soonthon L.

Thanks, Soonthon. But the expressions I mentioned are highly repetitive in the current project and in future ones, so I do need new rules to deal with them.



Nora Diaz  Identity Verified
Mexico
Local time: 16:56
Member (2002)
English to Spanish
Would a segmentation exception help? Sep 13, 2016

It sounds like an exception to the full stop rule, rather than a new rule, is what you need. Have a look at this article, maybe it will help:

http://noradiaz.blogspot.mx/2015/12/segmentation-exceptions-in-sdl-trados.html

Best,

Nora Díaz
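
In SRX terms, an exception of this kind is a break="no" rule listed ahead of the general full stop rule, since the first rule that matches at a position decides whether to break there. The toy segmenter below sketches that ordering effect. Its two patterns are illustrative ASCII stand-ins, not the defaults of any particular tool, and the example sentences are shortened versions of the ones in the original post.

```python
import re

# Ordered (is_break, beforebreak, afterbreak) triples; patterns are assumptions.
EXCEPTION = (False, r'"(?:[A-Z]|\d+)\.', r"\s")  # quoted label such as "A. or "1.
FULL_STOP = (True, r"\.", r"\s")                 # general break after a dot

def segment(text, rules):
    """Split text wherever the first matching rule at a position is a break rule."""
    segments, start = [], 0
    for pos in range(1, len(text)):
        for is_break, before, after in rules:
            if (re.search("(?:" + before + r")\Z", text[:pos])
                    and re.match(after, text[pos:])):
                if is_break:
                    segments.append(text[start:pos].strip())
                    start = pos
                break  # first matching rule wins; later rules are not consulted
    segments.append(text[start:].strip())
    return segments

TEXT = ('Please refer to the section "A. Industry Information" in this manual. '
        'Please refer to the section "1. Company Information" in this manual.')

print(len(segment(TEXT, [FULL_STOP])))             # 4
print(len(segment(TEXT, [EXCEPTION, FULL_STOP])))  # 2
print(len(segment(TEXT, [FULL_STOP, EXCEPTION])))  # 4
```

The last call shows that an exception placed after the full stop rule has no effect, which is one more thing worth checking when a new rule appears not to work.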

