How are you supposed to fix these kinds of CAT tool segment problems?
Thread poster: LegalTransform

LegalTransform  Identity Verified
United States
Local time: 14:58
Member (2002)
Spanish to English
+ ...
Jul 8, 2015

I often get stuff like this:

SEGMENT 1: Introducción a los Sistemas de
SEGMENT 2: Información

One possible translation: Introduction to the Computer Systems

So what to do:

You can't do this:
SEGMENT 1: Introduction to the Computer Systems
SEGMENT 2: [blank]

Because then the word "Información" is in the TM as [blank] and the word "Información" is incorrectly added to the translation of Segment 1. In addition, at least with Trados, you get "Introduction to the Computer Systems Información".

You can't do this:
SEGMENT 1: Introduction to the Computer
SEGMENT 2: Systems
Because then the word "Información" is in the TM as "Systems" which is incorrect.

Another example:
SEGMENT 1: Dieter Dieter ist am Samstag
SEGMENT 2: in einem Krankenhaus in Paris
SEGMENT 3: gestorben

Possible translation: Dieter Dieter died on Saturday at a hospital in Paris.

You can't do this:
SEGMENT 1: Dieter Dieter is on Saturday
SEGMENT 2: at a hospital in Paris
SEGMENT 3: died

You can't do this:
SEGMENT 1: Dieter Dieter died on Saturday
SEGMENT 2: at a hospital in Paris
SEGMENT 3: [blank]

because now whenever poor Dieter does anything on a Saturday, the TM will think he is dead... and now you have the same problem with another empty segment
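
To make the problem concrete, here is a toy model of a translation memory as a plain source-to-target lookup. It is only an illustration under simplifying assumptions: the `ToyTM` class and its methods are invented for this sketch and do not reflect how Trados or any other CAT tool actually stores its data.

```python
class ToyTM:
    """Hypothetical, minimal translation memory: one target per source segment."""

    def __init__(self):
        self.units = {}

    def store(self, source, target):
        # Each confirmed segment pair becomes one translation unit.
        self.units[source] = target

    def lookup(self, source):
        # Exact match only; returns None when nothing is stored.
        return self.units.get(source)


tm = ToyTM()
# The first workaround: full translation in segment 1, segment 2 left empty.
tm.store("Introducción a los Sistemas de", "Introduction to the Computer Systems")
tm.store("Información", "")

# A later document contains "Información" as a stand-alone segment.
polluted_match = tm.lookup("Información")
print(repr(polluted_match))
```

In this simplified model the later lookup comes back as an empty string, which is exactly the kind of bad match the empty-segment workaround leaves behind.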

Update with solution from Miguel Carmona and Emma Goldsmith:

Miguel Carmona wrote:

Have you enabled source editing in Studio 2011?

From Studio 2011 Help:

1. Select Project > Project Settings.
2. In the Project Settings dialog box, select Project from the navigation tree.
3. Select the Allow source editing for supported file types check box.
4. Click OK to close the Project Settings dialog box and save your changes.

[Edited at 2015-07-08 21:19 GMT]




[Edited at 2015-07-08 21:29 GMT]

[Edited at 2015-07-08 22:20 GMT]


 

Michael Beijer  Identity Verified
United Kingdom
Local time: 19:58
Member (2009)
Dutch to English
+ ...
converted PDFs often contain spurious line endings, which can result in incorrect segmentation Jul 8, 2015

Hi Jeff,

It depends on what CAT tool you're using and what type of document these cut-up sentences are in. In most CAT tools you can simply join these segments. In CafeTran, for example, I would just press Alt+J every time I came across an incorrectly segmented sentence (Alt+J joins, Alt+S splits).

However, this can get tedious. It is therefore often better to deal with the source of the problem, which is usually a poorly converted PDF. Often, when converting a PDF to a .doc or .docx, a converter will add spurious line endings, and these are what are creating all these extra segments.

You can either use a better PDF converter, tweak the settings in your current PDF converter, or use a special tool to fix them in the Word document after conversion. Currently, the best tool to fix these things is TransTools, which has a special module called "Unbreaker" exactly for this kind of thing:

http://www.translatortools.net/word-unbreaker.html

[Edited at 2015-07-08 16:08 GMT]


 

LegalTransform  Identity Verified
United States
Local time: 14:58
Member (2002)
Spanish to English
+ ...
TOPIC STARTER
However Jul 8, 2015

If the agency provides the CAT file, shouldn't I be able to charge extra for all this wasted time? It just seems like more and more we are spending time formatting stuff for the computer and for the sake of some fabled future benefit rather than concentrating on providing the client/reader with a good translation.

I sometimes find myself forced to translate things in a weird, off way or to make linguistic compromises just so they fit the tool, and that is so incredibly wrong.

Instead of concentrating and focusing on language, I also have to figure out ways to word things so that they comply with the parameters of the tool and make sure that they are generic enough in the event they are reused again in a different context, resulting in good, creative and innovative translations being tossed aside in favor of vapid and boring language that the machine can assimilate, process and later regurgitate.




[Edited at 2015-07-08 16:22 GMT]


 

Miguel Carmona  Identity Verified
United States
Local time: 11:58
English to Spanish
... Jul 8, 2015

Jeff Whittaker wrote:

I often get stuff like this:

SEGMENT 1: Introducción a los Sistemas de
SEGMENT 2: Información

One possible translation: Introduction to the Computer Systems

So what to do:

You can't do this:
SEGMENT 1: Introduction to the Computer Systems
SEGMENT 2: [blank]

Because then the word "Información" is in the TM as [blank] and the word "Información" is incorrectly added to the translation of Segment 1. In addition, at least with Trados, you get "Introduction to the Computer Systems Información".


This is what you need to do on the source side:
(Simply cut and paste "Información")
SEGMENT 1: Introducción a los Sistemas de Información
SEGMENT 2: [blank]

So, you end up with this on the target side:
SEGMENT 1: Introduction to the Computer Systems
SEGMENT 2: [blank]

At least in SDL Studio 2014, this is very easy. It takes a couple of mouse clicks, but I am sure to some people it will feel like a lot of work.


 

LegalTransform  Identity Verified
United States
Local time: 14:58
Member (2002)
Spanish to English
+ ...
TOPIC STARTER
That would be a wonderful solution Jul 8, 2015

But I've never been able to do that (Studio 2011) because it won't let me change the target segments.

Is there a way to unlock the segments? Or is this only possible with Studio 2014?


Miguel Carmona wrote:



[Edited at 2015-07-08 16:30 GMT]

[Edited at 2015-07-08 16:31 GMT]


 

Giles Watson  Identity Verified
Italy
Local time: 20:58
Italian to English
Edit the source text Jul 8, 2015

Jeff Whittaker wrote:

And please don't tell me that I'm supposed to waste my time editing and combining all these segments just for the sake of the computer!



These are source text issues, not your CAT's fault.

It happens when the source text has been exported as a Word document from a DTP program or similar. I've just finished a largish job updating the English version of a book I translated a few years ago and I received the new Italian text with all sorts of irritatingly placed hard returns.

What I generally do is remove the unwanted breaks in Word with a search and replace routine, taking care not to delete the ones that are actually needed, before I import the text into my CAT. Charge more for source texts of this kind.
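
For anyone who would rather script this than do it by hand in Word, here is a rough sketch of the same idea applied to plain text. It is not the search and replace routine described above and not what TransTools Unbreaker does; the function name `join_broken_lines` and the rule "a line that does not end in sentence punctuation was probably broken by mistake" are assumptions made for this example.

```python
import re

# Assumed heuristic: a line ending in one of these characters is complete.
SENTENCE_END = re.compile(r"[.!?:;]$")


def join_broken_lines(text):
    """Join lines that were probably split by a spurious hard return."""
    repaired = []
    for line in text.split("\n"):
        line = line.strip()
        previous = repaired[-1] if repaired else ""
        # Join only when the previous line is non-empty and does not
        # look finished; blank lines always start a new paragraph.
        if line and previous and not SENTENCE_END.search(previous):
            repaired[-1] = previous + " " + line
        else:
            repaired.append(line)
    return "\n".join(repaired)


sample = (
    "Dieter Dieter ist am Samstag\n"
    "in einem Krankenhaus in Paris\n"
    "gestorben.\n"
    "\n"
    "Introducción a los Sistemas de\n"
    "Información"
)
fixed = join_broken_lines(sample)
print(fixed)
```

Note that a heuristic like this will also glue a heading onto the paragraph that follows it if there is no blank line in between, so the result still needs a manual check for breaks that were actually needed.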


 
Solution with "Virtual segment joining" Jul 8, 2015

Hi Jeff,

of course it would be perfect to have the problems fixed on the source language side, but in real life that's not always an option.

How to fix this during translation depends on the CAT tool you use. With STAR Transit, you can solve this with the virtual segment joining feature: the two segments are then treated virtually (and only virtually) as one, so you can translate them correctly as "Introduction to the Computer Systems". (By the way, this takes one mouse click, not "a couple".)

"Virtually (and only virtually)" means: In the TM, there are still two physical segments that are virtually (and only virtually) "connected".

If a follow-up project contains the same combination, this translation is re-used:
SEGMENT x: Introducción a los Sistemas de
SEGMENT y: Información
=> SEGMENT x+y: Introduction to the Computer Systems

If a follow-up project contains only one of these segments, this translation is NOT used (but maybe another existing match):
SEGMENT z: Información => [not translated if "Información => Information" was never translated before]
or
SEGMENT z: Información => Information [if "Información => Information" is in the TM]
but never ever
SEGMENT z: Información => Systems
or
SEGMENT z: Información => [blank]
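
The matching behaviour listed above can be modelled roughly as follows. This is only a sketch of the idea, not STAR Transit's actual implementation: the `JoinedTM` class, its methods and the choice to key entries on a tuple of source segments are all assumptions made for illustration.

```python
class JoinedTM:
    """Hypothetical TM that keys each entry on a tuple of source segments."""

    def __init__(self):
        self.units = {}

    def store(self, sources, target):
        # `sources` is a tuple: one element for a normal segment,
        # several for segments that were virtually joined.
        self.units[tuple(sources)] = target

    def lookup(self, sources):
        # A joined entry only matches the same combination of segments.
        return self.units.get(tuple(sources))


tm = JoinedTM()
tm.store(
    ("Introducción a los Sistemas de", "Información"),
    "Introduction to the Computer Systems",
)

# Same combination in a follow-up project: the translation is re-used.
joined_match = tm.lookup(("Introducción a los Sistemas de", "Información"))

# "Información" on its own: no match from the joined entry.
single_before = tm.lookup(("Información",))

# Once the single segment has been translated, that entry is offered instead.
tm.store(("Información",), "Information")
single_after = tm.lookup(("Información",))

print(joined_match, single_before, single_after)
```

In this model a stand-alone "Información" can never come back as "Systems" or as a blank, because the joined entry is stored under the combination of both segments and not under either one alone.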


Please excuse me if the explanation sounds weird. In "real life", this feature is very simple and obvious, but you can hardly describe it with words...

Regards,

O.N.

[Edited at 2015-07-08 16:34 GMT]

[Edited at 2015-07-08 16:35 GMT]

[Edited at 2015-07-08 16:36 GMT]


 

Michael Beijer  Identity Verified
United Kingdom
Local time: 19:58
Member (2009)
Dutch to English
+ ...
Unbreaker Jul 8, 2015

Giles Watson wrote:

Jeff Whittaker wrote:

And please don't tell me that I'm supposed to waste my time editing and combining all these segments just for the sake of the computer!



These are source text issues, not your CAT's fault.

It happens when the source text has been exported as a Word document from a DTP program or similar. I've just finished a largish job updating the English version of a book I translated a few years ago and I received the new Italian text with all sorts of irritatingly placed hard returns.

What I generally do is remove the unwanted breaks in Word with a search and replace routine, taking care not to delete the ones that are actually needed, before I import the text into my CAT. Charge more for source texts of this kind.



Hi Giles,

have you tried Unbreaker yet? It's actually very clever.

See: http://www.translatortools.net/word-unbreaker.html
And for a short demo: http://www.translatortools.net/how/word-unbreaker-mini-demo.html (animated GIF)


 

Miguel Carmona  Identity Verified
United States
Local time: 11:58
English to Spanish
... Jul 8, 2015

Jeff Whittaker wrote:

But I've never been able to do that (Studio 2011) because it won't let me change the target segments.

Is there a way to unlock the segments? Or is this only possible with Studio 2014?


Here is how you proceed:

SEGMENT 2:
1) Choose "Edit Source"
2) Cut "Información"

SEGMENT 1:
1) Choose "Edit Source"
2) Paste "Información"

I do not think it was possible in Studio 2011.

Good luck.


 

Georgi Kovachev  Identity Verified
Bulgaria
Local time: 21:58
Member (2010)
English to Bulgarian
+ ...
You are absolutely right, but Jul 8, 2015

Jeff Whittaker wrote:

If the agency provides the CAT file, shouldn't I be able to charge extra for all this wasted time?



[Edited at 2015-07-08 16:22 GMT]

What would you do if the agency told you that they have a long-term contract with fixed prices and you cannot renegotiate? You will face a take-it-or-leave-it situation.

One solution would be to cancel all discounts for that client if you have granted any.


 

LegalTransform  Identity Verified
United States
Local time: 14:58
Member (2002)
Spanish to English
+ ...
TOPIC STARTER
Nope, doesn't work in 2011 Jul 8, 2015

But that is an interesting new feature. Now if they would only eliminate the language limit...

Miguel Carmona wrote:




 

Sergei Tumanov  Identity Verified
Local time: 21:58
English to Russian
+ ...
I simply merge two segments in Studio 2011 Jul 8, 2015

It works very well for me.
Never had problems with this ...
I select two source segments, right-click, and select "Merge Segments" (Ctrl+Alt+S).

added later: or ... merge three segments....

[Edited at 2015-07-08 17:37 GMT]


 

LegalTransform  Identity Verified
United States
Local time: 14:58
Member (2002)
Spanish to English
+ ...
TOPIC STARTER
The merge segment option is greyed out and Jul 8, 2015

does not work. I do have Split Segment, but not merge segment.

Sergei Tumanov wrote:

It works very well for me.
Never had problems with this ...
I select two source segments, right-click, and select "Merge Segments" (Ctrl+Alt+S).

added later: or ... merge three segments....


[Edited at 2015-07-08 18:05 GMT]


 

Sindy Cremer

Member (2008)
English to Dutch
+ ...
force error message Jul 8, 2015

I simply hit Ctrl+Alt+Enter; you get an error message (white x in a red circle: 'segment has not been translated') but the line is left blank in the translated document.

 

LegalTransform  Identity Verified
United States
Local time: 14:58
Member (2002)
Spanish to English
+ ...
TOPIC STARTER
That's an interesting solution.... Jul 8, 2015

.... but what happens when the editor or PM says - you missed/skipped a segment?


Sindy Cremer wrote:

I simply hit Ctrl+Alt+Enter; you get an error message (white x in a red circle: 'segment has not been translated') but the line is left blank in the translated document.


 