Mysterious segmentation rules
Thread poster: Heinrich Pesch

Heinrich Pesch  Identity Verified
Finland
Local time: 06:24
Member (2003)
Finnish to German
+ ...
Nov 9, 2006

One thing that always bothers me when working with Wordfast is its strange behaviour when choosing the segmentation point.
I know I can influence the rules to some extent by adding items to the abbreviation list and by choosing options other than the recommended ones, but why can it not use common sense in the first place?

Just now I had a sentence with "e.g. a fixed ". Wf first segmented after the first full stop; then I expanded the segment, but it stopped at the next full stop, after the g., and only after that did it take in the whole sentence.
When I look at the segmentation options, "sentence" is not recommended. Why not? What would happen?
I would like Wf always to segment where a full stop is followed by a space and then a capital letter that starts the next sentence. But very often Wf grabs two or three sentences into one segment. On the other hand, it stupidly segments at places like "312.15.67", which is rather annoying.
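
To make the wished-for rule concrete, here is a minimal Python sketch of "full stop + space + capital letter". It is only an illustration of the rule described above, not Wordfast's actual algorithm; the function name and the sample text are made up.

```python
import re

# Boundary = whitespace that follows a full stop and precedes a capital letter.
# Illustrative rule only; this is NOT how Wordfast segments internally.
BOUNDARY = re.compile(r"(?<=\.)\s+(?=[A-ZÄÖÜ])")


def split_sentences(text: str) -> list[str]:
    """Split text at every 'full stop + space + capital letter' boundary."""
    return [part for part in BOUNDARY.split(text) if part]


sample = (
    "See item 312.15.67 for details. "
    "The next step, e.g. a fixed mount, follows. Done."
)
for sentence in split_sentences(sample):
    print(sentence)
```

With this rule, "312.15.67" stays intact because no space follows its full stops, and "e.g. a" stays intact because a lowercase letter follows. An abbreviation followed by a capital ("e.g. The") would still be split, which is where an abbreviation list comes in.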

What segmentation rules have you tried out and what do you use?

Regards
Heinrich



Valters Feists  Identity Verified
Latvia
Local time: 06:24
Member (2005)
English to Latvian
+ ...
experiment with finer settings... Nov 9, 2006

I too expand segments manually quite often. It's not that terrible if you know the keyboard shortcut (Alt+PgDn).
Are you also aware of manual page breaks versus paragraph-end characters, non-breaking spaces versus normal spaces?

I think you can fiddle with Wf's setup->segs->end-of-segment punctuation (ESP) + abbreviations. You can enter your own items in the abbreviations and ESP boxes (this has to be done carefully). This could depend on the Wf version, though.
A while ago I was looking for a way of making the regular space character act as a segment delimiter (so that one TU = one word), which unfortunately doesn't seem to be possible, so I have to resort to a roundabout replace-all-and-later-unreplace-all routine.
Apparently there are some things in Wf that you just can't control. :-/

Regards,
Valters Feists
Technical Latvian translator



Gerard de Noord  Identity Verified
France
Local time: 05:24
Member (2003)
German to Dutch
+ ...
Don't make any special settings Nov 9, 2006

Hi Heinrich,

You shouldn't make any special settings if you want your segments to be Trados compatible. Full sentences aren't.

When you encounter e.g., select those four characters and press Ctrl+Alt+T to add the abbreviation to the list of abbreviations. The text will be resegmented.

Regards,
Gerard



Heinrich Pesch  Identity Verified
Finland
Local time: 06:24
Member (2003)
Finnish to German
+ ...
TOPIC STARTER
Why Trados compatible? Nov 9, 2006

I would rather it were common-sense compatible

If there is a full stop plus a space plus a capital letter, I really would like the segmenting to take place there and not two sentences later.
Can anybody explain why sentence segmentation is not recommended?
In my experience, Wf segmentation rules differ from Trados at least in lists where the items are numbered German fashion.
1.)
2.)
etc.

Brackets are a problem for Wf too. I often encounter situations where a ")." is left to the next segment and I cannot get Wf to segment after it; instead it jumps back and forth, either too far or not far enough.

Perhaps the reason for this is that Wf is French, and the French have some strange punctuation rules, if I remember right?

Regards
Heinrich

[Edited at 2006-11-09 18:53]



Philippe Etienne  Identity Verified
Spain
Local time: 05:24
Member
English to French
I love it Nov 9, 2006

Heinrich Pesch wrote:

I would rather it were common-sense compatible
...


I am afraid the meaning of common sense is somewhat lost nowadays...
Thanks for the laugh
Philippe



Valters Feists  Identity Verified
Latvia
Local time: 06:24
Member (2005)
English to Latvian
+ ...
Wf 3.35 - more or less the following settings... Nov 10, 2006

In setup/segs:

1) Add e.g. to the list of abbreviations.
Your list can be for example "Inc.,Corp.,Ltd.,e.g." (separate with commas)
2) You can leave . (full stop) in the ESP box,
3)...but make sure you uncheck "An ESP without a trailing space ends a segment",
4) uncheck also "An ESP + a space + a lowercase end a segment".
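
To show what these four settings are meant to achieve, here is a rough Python sketch. It only models the options as described in this post (abbreviation list, ESP box, and the two checkboxes); the function and parameter names are made up, and the real Wordfast logic may well differ.

```python
import re


def segment(text, abbreviations=("Inc.", "Corp.", "Ltd.", "e.g."), esp=".",
            split_without_space=False, split_before_lowercase=False):
    """Toy segmenter modelled on the settings above (not Wordfast's code)."""
    segments, start = [], 0
    for match in re.finditer("[" + re.escape(esp) + "]", text):
        end = match.end()
        words = text[start:end].split()
        # 1) A listed abbreviation never ends a segment.
        if words and words[-1] in abbreviations:
            continue
        rest = text[end:]
        if not rest:
            break  # end of text; the tail is added below
        if rest[0].isspace():
            following = rest.lstrip()
            # 4) "ESP + space + lowercase" ends a segment only if enabled.
            if following and following[0].islower() and not split_before_lowercase:
                continue
        elif not split_without_space:
            # 3) An ESP without a trailing space ends a segment only if enabled.
            continue
        segments.append(text[start:end].strip())
        start = end
    tail = text[start:].strip()
    if tail:
        segments.append(tail)
    return segments


text = "Acme Inc. sells parts, e.g. a fixed mount. See 312.15.67 for details. Done."
for seg in segment(text):
    print(seg)
```

Calling `segment("See 312.15.67 now.", split_without_space=True)` reproduces the annoying split inside the number, which is why option 3 is unchecked above.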

P.S.
In French punctuation, a space character comes before exclamation and question marks; it also separates quote marks from words, e.g., « merci ! » .

Regards,
Valters Feists
Technical Latvian translator



Heinrich Pesch  Identity Verified
Finland
Local time: 06:24
Member (2003)
Finnish to German
+ ...
TOPIC STARTER
Thanks Valters Nov 12, 2006

I implemented the settings you suggested. So far I haven't noticed any changes. Wf continues to merge two sentences into one segment in certain places.
But when I try Trados 7.5, I notice Trados has no difficulties with the same text. So Wf segmentation rules are definitely not Trados-compatible.
The same text was also translated on another machine with a different Wf and different settings, and the result is the same as in my case.

Sorry I cannot cite the text, as it is confidential, but they are normal sentences of the ". T"-model.

Regards
Heinrich



Valters Feists  Identity Verified
Latvia
Local time: 06:24
Member (2005)
English to Latvian
+ ...
could it be...? Nov 12, 2006

Could it be that the sentences are separated by a non-breaking space (nbs) character instead of a simple space? Check by switching on the inverted "P" (¶) button; the nbs characters will then be shown as degree signs, and normal spaces as middle dots. I think Wf cannot be trained to handle nbs's... my option would be to replace them and later restore them while translating.
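
If the text can be exported as plain text, the same check can be done outside Word. Below is a small Python sketch, assuming the non-breaking space is U+00A0; the function names are invented for the example, and it only covers the "replace" half of the routine (the NBSPs it converts are not restored afterwards).

```python
import re

NBSP = "\u00a0"  # non-breaking space, assumed to be the character in question


def nbsp_after_full_stop(text: str) -> list[int]:
    """Return the index of every NBSP that directly follows a full stop."""
    return [m.start() + 1 for m in re.finditer(r"\." + NBSP, text)]


def normalise_boundaries(text: str) -> str:
    """Replace an NBSP after a full stop with an ordinary space."""
    return text.replace("." + NBSP, ". ")


sample = "First sentence.\u00a0Second sentence. Keep 10\u00a0kg."
print(nbsp_after_full_stop(sample))
print(normalise_boundaries(sample) == "First sentence. Second sentence. Keep 10\u00a0kg.")
```

Only the NBSP at a sentence boundary is touched; the one in "10 kg" is left alone, since there it is presumably intentional.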


Heinrich Pesch  Identity Verified
Finland
Local time: 06:24
Member (2003)
Finnish to German
+ ...
TOPIC STARTER
No, they are normal spaces Nov 12, 2006

This phenomenon is so common that I never thought about it before; almost every document has such stumbling blocks for no apparent reason, except that the sentences include numbers and abbreviations.
Cheers
Heinrich



Mick De Meyer  Identity Verified
Belgium
Local time: 05:24
English to Dutch
+ ...
More weird segmentation, any help? Apr 17, 2011

Hi all,

Here is a specific problem I'm having with segmentation: the curly brackets.

I often need to translate this kind of sentence string:

Sentence number one.{end_li}{li}Sentence number two.{end_li}{end_para}{breakline}{para}Sentence number three.

... and so on. However, Wordfast mysteriously decides that this is a segment:

Sentence number one.{

How on earth can I simply teach it to segment before the curly bracket? How is it at all logical to end a segment with an open bracket?

If anyone knows how this is done, I would be very grateful!
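
One possible workaround outside Wordfast is to pre-process the string so that every curly-bracket tag sits on its own. The Python sketch below assumes the tags always look like `{lowercase_name}`, as in the example above; the names `TAG` and `split_on_tags` are made up, and this is not a Wordfast setting.

```python
import re

# A tag is assumed to be lowercase letters/underscores in curly brackets,
# e.g. {li}, {end_li}, {breakline}.
TAG = re.compile(r"(\{[a-z_]+\})")


def split_on_tags(text: str) -> list[str]:
    """Split text into tags and the sentences between them."""
    return [part for part in TAG.split(text) if part]


source = (
    "Sentence number one.{end_li}{li}Sentence number two."
    "{end_li}{end_para}{breakline}{para}Sentence number three."
)
parts = split_on_tags(source)
translatable = [p for p in parts if not TAG.fullmatch(p)]
print(translatable)
```

Each sentence could then be placed in its own paragraph before translation and the tags re-inserted afterwards, in the spirit of the replace-and-restore routine mentioned earlier in the thread.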

