Add new segmentation rules (SRX)
Thread poster: Jessicaliu
Jessicaliu  Identity Verified
Hong Kong
Local time: 20:20
Chinese to English
Sep 12, 2016

I need to add two new segmentation rules to deal with some repetitive expressions. For example,

For further details please refer to the section "A. Industry Information" in this manual. For further details please refer to the section "1. Company Information" in this manual.

These two sentences are split into four segments, each ending with a dot. But I need the tool to ignore the dots after "A" and "1".

The rules I wrote didn't work. Could you please help me find what is wrong?

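Each rule is a rule break="no" element with a beforebreak and an afterbreak pattern:

Rule 1
beforebreak: \d\.
afterbreak: \s+\p{Ll}

Rule 2
beforebreak: [\p{Lu}]\.
afterbreak: [\p{Z}]\[\p{Lu}]

Rule 3
beforebreak: ^\s*[0-9]+\.
afterbreak: \s+\p{Ll}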

None of these rules works. (Since some XML tags are not permitted here, I have omitted the angle brackets.)
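For readers who want to check patterns like these outside the CAT tool, here is a minimal Python sketch. It is not part of the original post. It assumes SRX-style semantics in which a rule applies at a position when beforebreak matches the text ending there and afterbreak matches the text starting there. Python's standard re module does not support \p{Lu}, \p{Ll} or \p{Z}, so the sketch substitutes the ASCII classes [A-Z], [a-z] and a plain space. The helper name rule_positions and the FIXED patterns are hypothetical.

```python
import re

SAMPLE = ('For further details please refer to the section "A. Industry Information" '
          'in this manual. For further details please refer to the section '
          '"1. Company Information" in this manual.')

# (beforebreak, afterbreak) pairs from the post, with ASCII stand-ins
# for \p{Lu}, \p{Ll} and \p{Z}.
POSTED = {
    "rule 1": (r"\d\.", r"\s+[a-z]"),
    "rule 2": (r"[A-Z]\.", r"[ ]\[[A-Z]"),  # the escaped "[" is kept from the post
    "rule 3": (r"^\s*[0-9]+\.", r"\s+[a-z]"),
}

# Hypothetical variants that expect an uppercase letter after the dot.
FIXED = {
    "digit": (r"\d\.", r"\s+[A-Z]"),
    "letter": (r"[A-Z]\.", r"\s+[A-Z]"),
}

def rule_positions(before, after, text):
    """Return offsets where `before` ends and `after` begins."""
    hits = []
    for pos in range(len(text) + 1):
        ends_here = re.search("(?:" + before + r")\Z", text[:pos])
        starts_here = re.match(after, text[pos:])
        if ends_here and starts_here:
            hits.append(pos)
    return hits

for name, (before, after) in {**POSTED, **FIXED}.items():
    print(name, rule_positions(before, after, SAMPLE))
```

Under these assumptions, none of the three posted rules applies anywhere in the sample. Rules 1 and 3 expect a lowercase letter after the dot, but the sample continues with "Industry" and "Company". The afterbreak of rule 2 contains \[, which asks for a literal opening bracket. Rule 3 also starts with ^, while the number in the sample follows a quotation mark in the middle of a sentence. The FIXED variants do match after "A." and "1.", but they are broad: they would also suppress breaks after any sentence that ends in a digit or a capital letter, so a narrower beforebreak (for example one that includes the opening quotation mark) may be safer.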


[Edited at 2016-09-12 07:09 GMT]

[Edited at 2016-09-12 07:12 GMT]

[Edited at 2016-09-12 08:44 GMT]



Soonthon LUPKITARO(Ph.D.)  Identity Verified
Thailand
Local time: 19:20
Member (2004)
English to Thai
Similar segments Sep 12, 2016

Jessicaliu wrote:


I need to add two new segmentation rules to deal with some repetitive expressions. For example,

For further details please refer to the section "A. Industry Information" in this manual. For further details please refer to the section "1. Company Information" in this manual.

These two sentences are split into four segments, each ending with a dot. But I need the tool to ignore the dots after "A" and "1".



If I were you, I would leave the segmentation rules as they are, since the dot is essential for most segments. Similar segments will still be found as fuzzy or concordance hits, though.

Soonthon L.


Jessicaliu  Identity Verified
Hong Kong
Local time: 20:20
Chinese to English
TOPIC STARTER
thanks Sep 13, 2016

Soonthon LUPKITARO(Ph.D.) wrote:

If I were you, I would leave the segmentation rules as they are, since the dot is essential for most segments. Similar segments will still be found as fuzzy or concordance hits, though.

Soonthon L.

Thanks, Soonthon. But the expressions I mentioned are highly repetitive in the current project and will be in future ones, so I do need new rules to deal with them.



Nora Diaz  Identity Verified
Mexico
Local time: 05:20
Member (2002)
English to Spanish
Would a segmentation exception help? Sep 13, 2016

It sounds like an exception to the full stop rule, rather than a new rule, is what you need. Have a look at this article, maybe it will help:

http://noradiaz.blogspot.mx/2015/12/segmentation-exceptions-in-sdl-trados.html

Best,

Nora Díaz
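To illustrate how an exception can take priority over the full stop rule, here is a minimal sketch written in Python rather than in any tool's actual rule format. It assumes SRX-style ordering: at each position the rules are tried in order, and the first one that matches decides whether to break, so no-break exceptions are listed before the general break rule. The patterns, the RULES list and the function name segment are illustrative assumptions, and ASCII [A-Z] stands in for \p{Lu}.

```python
import re

SAMPLE = ('For further details please refer to the section "A. Industry Information" '
          'in this manual. For further details please refer to the section '
          '"1. Company Information" in this manual.')

# (is_break, beforebreak, afterbreak); the first matching rule wins.
RULES = [
    # Exception: a quoted heading label such as "A. or "1. followed by a capital.
    (False, r'"[A-Z0-9]\.', r"\s+[A-Z]"),
    # General rule: break after sentence-final punctuation followed by whitespace.
    (True, r"[.?!]", r"\s+"),
]

def segment(text, rules=RULES):
    """Split text at positions where the first matching rule is a break rule."""
    segments, start = [], 0
    for pos in range(1, len(text)):
        for is_break, before, after in rules:
            ends_here = re.search("(?:" + before + r")\Z", text[:pos])
            starts_here = re.match(after, text[pos:])
            if ends_here and starts_here:
                if is_break:
                    segments.append(text[start:pos])
                    start = pos
                break  # the first matching rule decides; skip the rest
    segments.append(text[start:])
    return [s.strip() for s in segments if s.strip()]

for seg in segment(SAMPLE):
    print(seg)
```

With the exception listed first, the sample from the opening post comes out as two segments. With only the general rule, segment(SAMPLE, RULES[1:]), it comes out as four, which matches the behaviour described in the question. How a particular CAT tool orders and stores its exceptions may differ from this sketch.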

