Defining your own segmentation rules for Chinese source for CAT tools?
Thread poster: PatentTrans

PatentTrans
United States
Chinese to English
Oct 31, 2013

Has anyone tried defining their own segmentation rules for CAT software with Chinese as the source language? I'm using punctuation marks to break up paragraphs, and the results aren't bad. For one of my documents (a patent), about 1/3 to 1/2 of the segments came out as unique. Is it possible to optimize this further? Of course, if the segments are too short I'll run into readability issues. Chinese grammar is kind of chaotic, and I'm having a tough time finding a reliable pattern.
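For anyone who wants to experiment outside the CAT tool first, here is a minimal Python sketch of punctuation-based segmentation and the resulting share of unique segments. The sets of break characters, the sample text, and the helper names `segment` and `unique_ratio` are assumptions made for illustration, not the rules of any particular CAT tool, and "unique" is taken here to mean "distinct segment text".

```python
import re

# Hypothetical rule sets: sentence-final marks only, or sentence plus clause marks.
SENTENCE_BREAKS = "。！？；"
CLAUSE_BREAKS = SENTENCE_BREAKS + "，"


def segment(text, breaks):
    """Split text after any character in `breaks`, keeping the mark with its segment."""
    cls = re.escape(breaks)
    pattern = "[^" + cls + "]+[" + cls + "]*"
    return [s.strip() for s in re.findall(pattern, text) if s.strip()]


def unique_ratio(segments):
    """Share of distinct segments among all segments (1.0 means no repetitions)."""
    return len(set(segments)) / len(segments) if segments else 0.0


# Made-up patent-style sample with a repeated opening clause.
sample = "所述装置包括壳体，所述壳体设有开口。所述装置包括壳体，所述壳体设有盖板。"

print(unique_ratio(segment(sample, SENTENCE_BREAKS)))  # 1.0  (2 segments, both distinct)
print(unique_ratio(segment(sample, CLAUSE_BREAKS)))    # 0.75 (4 segments, 3 distinct)
```

Breaking at the comma exposes the repeated clause, which is the effect being optimized here, but it also produces shorter segments that are harder to translate in isolation.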


Frank Lin
China
English to Chinese
Nov 2, 2013
Be careful doing this

You can define new splitting rules tailored to the 1/3 to 1/2 of the segments that come out as unique in your patent document.

But changing the segmentation type considerably changes the way a CAT tool works and, among other things, may also affect alignment, pre-translation, and so on. You should avoid repeatedly changing the segmentation type for a given document format, because doing so degrades the quality of the translation memory.
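A rough Python sketch of why this happens, assuming a translation memory that is simply a lookup keyed by exact source-segment text. The `segment` and `exact_match_rate` helpers, the rule strings, and the sample sentence are hypothetical, and real CAT tools also use fuzzy matching, which this ignores.

```python
import re


def segment(text, breaks):
    """Split text after any character in `breaks`, keeping the mark with its segment."""
    cls = re.escape(breaks)
    pattern = "[^" + cls + "]+[" + cls + "]*"
    return [s.strip() for s in re.findall(pattern, text) if s.strip()]


def exact_match_rate(tm, segments):
    """Fraction of segments that have an exact source match in the TM."""
    hits = sum(1 for s in segments if s in tm)
    return hits / len(segments) if segments else 0.0


OLD_RULES = "。"    # assumed original rule: break at full stops only
NEW_RULES = "。，"  # assumed new rule: also break at commas

source = "所述装置包括壳体，所述壳体设有开口。"

# Build a toy TM from the document as segmented under the old rules.
tm = {seg: "<translation>" for seg in segment(source, OLD_RULES)}

print(exact_match_rate(tm, segment(source, OLD_RULES)))  # 1.0
print(exact_match_rate(tm, segment(source, NEW_RULES)))  # 0.0
```

The same text that was fully covered by the memory yields no exact matches once the rules change, because the stored units no longer line up with the new segments.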

