Combining segments Thread poster: James McVay
| James McVay United States Local time: 23:39 Russian to English + ...
I'm translating an article where almost all of the first names of people mentioned are abbreviated as an initial followed by a period. Example: H. Karzai. OmegaT ends the segment after the period, thus splitting the sentence into two parts. Is there a way I can force combination of two (or more) segments? I realize OmegaT allows me to establish segmentation rules for common abbreviations, but that won't help in this instance.

Didier Briel France Local time: 05:39 English to French + ... Segmentation rules would help | May 4, 2010 |
James McVay wrote: I'm translating an article where almost all of the first names of people mentioned are abbreviated as an initial followed by a period. Example: H. Karzai. OmegaT ends the segment after the period, thus splitting the sentence into two parts. Is there a way I can force combination of two (or more) segments?

A simple solution is to replace the space (between "H." and "Karzai") with a non-breaking space in the source document, so that the split does not occur.

James McVay wrote: I realize OmegaT allows me to establish segmentation rules for common abbreviations, but that won't help in this instance.

It would, if you create a rule that prevents breaking:

Before: [A-Z]\.
After: \s

(The Segmentation/Exception check box must *not* be checked.) Put that as the first rule.

Didier

James McVay United States Local time: 23:39 Russian to English + ... TOPIC STARTER Learn something new every day | May 6, 2010 |
Thanks, Didier. I tried it and it works. Then I wrote rules for a few other abbreviations that come up repeatedly in Russian.

Samuel Murray Netherlands Local time: 05:39 Member (2006) English to Afrikaans + ... Two methods (and a third) | May 6, 2010 |
James McVay wrote: Is there a way I can force combination of two (or more) segments?

Allow me to restate what Didier said. You can (a) edit the source text to remove the offending full stops or to change the space into a non-breaking space, and then reload the project, or (b) add a segmentation rule. Didier's segmentation rule is very good. The regular expressions used by OmegaT are these: http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/Pattern.html

But you can also add rules on a per-case basis. For example, if your text contains "W. Churchill", you can add that specific name to the segmentation rules. Normally that would be cumbersome (OmegaT's GUI just wasn't designed for quick rule adding), so if you're just a little computer literate, you can try my little script for adding exceptions to the segmentation rules: http://leuce.com/tempfile/omtautoit/segadder.zip
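For anyone who wants to see why the exception rule has to come first, here is a minimal sketch in Python. It is only a simplified model of rule-based segmentation, not OmegaT's own code: the `segment` function, the default break rule, and the per-name rule for "W. Churchill" are all assumptions made for illustration, and Python's `re` module stands in for the Java regular expressions OmegaT uses (the two behave the same for these simple patterns).

```python
import re

# Each rule is (is_break, before_pattern, after_pattern). This mirrors the
# Before/After fields quoted in the thread, but the engine below is only a
# simplified model, not OmegaT's implementation.
EXCEPTION_INITIAL = (False, r"[A-Z]\.", r"\s")         # Didier's rule from the thread
EXCEPTION_CHURCHILL = (False, r"W\.", r"\sChurchill")  # hypothetical per-name rule
BREAK_SENTENCE = (True, r"[.?!]", r"\s")               # hypothetical default break rule


def segment(text, rules):
    """Split text into segments; the first rule that matches at a position wins."""
    segments, start = [], 0
    for pos in range(1, len(text)):
        head, tail = text[:pos], text[pos:]
        for is_break, before, after in rules:
            # "before" must match the text ending at pos, "after" the text starting there
            if re.search(rf"(?:{before})\Z", head) and re.match(after, tail):
                if is_break:
                    segments.append(head[start:].strip())
                    start = pos
                break  # later rules are not consulted at this position
    segments.append(text[start:].strip())
    return segments


text = "President H. Karzai arrived. He spoke briefly."
print(segment(text, [BREAK_SENTENCE]))
print(segment(text, [EXCEPTION_INITIAL, BREAK_SENTENCE]))
```

In this model, the first call splits the sentence after "H.", while the second keeps "President H. Karzai arrived." together because the exception is consulted before the break rule. One caveat that follows from the pattern itself: `[A-Z]\.` also suppresses a break after any sentence that ends in a capital letter plus a period (for example "... made in the USA. Next ..."), which is one reason a narrower per-name rule can be preferable.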
I don't make it | Jun 15, 2010 |
My problem this time is that they use each header twice, and they want it translated only the second time. So if a header is like "Options", the source goes:

Options
Options

and the target should be:

Options
Alternativ

I've made one exception by adding Options to the "before" in the segmentation rules exceptions, with "after" being empty, and also one by adding Options to the "after", with the "before" being empty. And one with "Options" in both before and after (although there is technically a Return in there as well, which might or might not matter). Still it makes the two "Options" two different segments, thereby forcing me to have the same translation for them...

Samuel Murray Netherlands Local time: 05:39 Member (2006) English to Afrikaans + ... The segmentation rules trick works only... | Jun 15, 2010 |
Harklas wrote: Still it makes the two "Options" two different segments, thereby forcing me to have the same translation for them...

The segmentation rules trick works only if the repeating segment is in a paragraph with other sentences. If your segment is in its own paragraph, no segmentation rule will force it to "join" another segment.

so then I guess | Jun 15, 2010 |
the only solution is to manually edit the finished translations...
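Samuel's point about paragraphs can be illustrated with a short Python sketch. It assumes, as his reply implies, that the text is cut into paragraphs before any sentence-level rule is applied; the function name `segment_document`, the hard-coded break test, and the way the three "Options" exceptions are written as patterns are all hypothetical and not taken from OmegaT.

```python
import re


def segment_document(text, exceptions=()):
    """Split into paragraphs first, then apply sentence rules inside each one."""
    segments = []
    for paragraph in text.splitlines():  # paragraph boundaries are fixed here
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        start = 0
        for gap in re.finditer(r"\s+", paragraph):
            before, after = paragraph[:gap.start()], paragraph[gap.start():]
            # An exception applies when both of its patterns match around the gap.
            is_exception = any(
                re.search(rf"(?:{b})\Z", before) and re.match(a, after)
                for b, a in exceptions
            )
            ends_sentence = re.search(r"[.?!]\Z", before) is not None
            if ends_sentence and not is_exception:
                segments.append(paragraph[start:gap.start()])
                start = gap.end()
        segments.append(paragraph[start:])
    return segments


# The three exceptions described in the thread, written as (before, after) patterns.
options_exceptions = [("Options", ""), ("", "Options"), ("Options", r"\s*Options")]
print(segment_document("Options\nOptions\n", options_exceptions))
```

Under that assumption the two headers always come out as two separate segments: the exception patterns are only ever tested against positions inside a single paragraph, so they never see the line break between the first "Options" and the second.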