Pre-segment/-translate -- your definition?
Thread poster: Samuel Murray

Samuel Murray  Identity Verified
Netherlands
Local time: 13:35
Member (2006)
English to Afrikaans
+ ...
Oct 24, 2008

G'day everyone

I would like to know what you understand by the words "pre-translation" and "pre-segmentation", specifically in the context of TTX files, in the context of uncleaned RTF files, and also in general (or in other CAT tools). I think people use these terms to mean different things, but I'd like to see if there is some kind of consensus.

Some background

In the Trados documentation I have, presegmentation is mentioned only once, and in the context the author really means "pretranslation". Pretranslation in Trados terminology can mean any of the following:

1. Autotranslate all segments for which a fuzzy or exact match in the TM exists.
2. Action #1 plus autotranslate all other segments with their own source text.
3. Autotranslate all segments with their own source text (using an empty TM).
4. Segment all text, but leave the segments untranslated/empty.
5. Combination of action #1 and action #4 (i.e. autotranslate all segments for which matches in the TM exist, and segment all other text, but leave those segments untranslated/empty).
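To make the differences between the five readings concrete, here is a minimal sketch in Python. It is not how Trados or any other tool implements pretranslation; the `Segment` class, the `lookup` helper and the `mode` numbers are hypothetical names that simply mirror the list above, and the toy TM handles exact matches only (no fuzzy matching).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    source: str
    target: Optional[str] = None  # None = segmented, but target left empty


def lookup(tm: dict, source: str) -> Optional[str]:
    """Toy TM lookup (assumption: exact matches only, no fuzzy matching)."""
    return tm.get(source)


def pretranslate(sentences, tm, mode):
    """Return one entry per sentence: a Segment, or the bare string if unsegmented."""
    result = []
    for s in sentences:
        match = lookup(tm, s)
        if mode == 1:    # segment only where the TM has a match
            result.append(Segment(s, match) if match is not None else s)
        elif mode == 2:  # TM match, otherwise copy source into target
            result.append(Segment(s, match if match is not None else s))
        elif mode == 3:  # copy source into target everywhere (empty TM)
            result.append(Segment(s, s))
        elif mode == 4:  # segment everything, leave every target empty
            result.append(Segment(s, None))
        elif mode == 5:  # TM match, otherwise segmented with empty target
            result.append(Segment(s, match))
        else:
            raise ValueError(f"unknown mode: {mode}")
    return result
```

Only mode 1 leaves some text unsegmented; modes 2 to 5 all produce a fully segmented file and differ only in what ends up in the target.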

So when a CAT tool claims to support TTX, but only pre-segmented, it is not using Trados terminology, and we have to ask ourselves what the CAT tool authors really mean.

DejaVu X's documentation does not have presegmentation, but it does speak of pretranslation. In DVX, pretranslation can refer to two things. The first is a process similar to #1 for Trados above. The second is that any bilingual format not translated in DVX is labelled "pretranslated" in DVX. So a TTX file that was translated by a human without TM will presumably be known as a "pretranslated TTX" file in DVX speak.

Wordfast's user manual does not mention presegmentation either, and it likewise speaks only of pretranslation, in the same sense as #1 above, with options to do some of the other operations too.

MemoQ's documentation does not mention presegmentation either, and it likewise speaks only of pretranslation. For operations within MemoQ, MemoQ uses the term in the same way as #1 in Trados. MemoQ also uses the term to refer to documents not created in MemoQ, but the manual is not clear about what it means by the term in that case.

It strikes me that MemoQ's web site says it supports presegmented TTX, but neither Trados nor MemoQ itself defines presegmentation anywhere. It is not a Trados term.

Your definitions?

Yet many people use these terms. A search of the forums seems to reveal that some use the terms "presegment" and "pretranslate" interchangeably. In some forums the same word means multiple things. It would seem to me that most people think "pretranslate" means #3 above.

So which of the five options above do you think is meant by (a) "pretranslate" and (b) "presegment"?

Looking forward to your replies.




[Edited at 2008-10-24 21:22]



Pablo Bouvier  Identity Verified
Local time: 13:35
German to Spanish
+ ...
Presegmentation vs. pretranslation Oct 24, 2008

I believe that presegmentation (dividing text into chunks following certain rules, usually punctuation and certain tags such as tabs, among others) and pretranslation (copying already translated text from a TM to the target segment of a translation unit) are two quite different things. IMHO, presegmentation does not need a TM; pretranslation does.
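The distinction can be illustrated with a small Python sketch. The splitting rule (sentence-final punctuation and tabs) and the function names `presegment` and `apply_tm` are assumptions made for illustration only; real CAT tools use far richer segmentation rules.

```python
import re


def presegment(text):
    """Split text into segments at sentence-final punctuation and at tabs.

    No TM is involved: this step only needs the text and the rules.
    """
    parts = re.split(r"(?<=[.!?])\s+|\t+", text)
    return [p.strip() for p in parts if p and p.strip()]


def apply_tm(segments, tm):
    """Pretranslation: pair each segment with its TM translation, or None if no match."""
    return [(seg, tm.get(seg)) for seg in segments]


segments = presegment("Open the file. Save it!\tClose the window")
pairs = apply_tm(segments, {"Save it!": "Speichern!"})
```

The first function runs without any TM at all; the second cannot do anything useful without one.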

I am sorry if some 'specialized CAT tool vendors' do not know how to use adequate vocabulary. But we should also keep in mind that they are usually engineers and information technicians, not linguists...

My two cents.



KSL Berlin  Identity Verified
Portugal
Local time: 12:35
Member (2003)
German to English
+ ...
Terms Oct 25, 2008

Part of your confusion here may arise from the fact that the terms are not mutually exclusive. You can have presegmentation WITH pretranslation as well. You can have presegmentation without it, however.

In pretranslation, the matches from a TM are applied where appropriate. Unrecognized text is left alone in many cases, unless, for example, one selects the TWB option to "copy source on no match". This yields a PARTIALLY segmented file.

However, if you are doing a TTX with MemoQ, DVX or other tools, as far as I know, with the current state of the art you will always need a fully presegmented file (some of these segments may be "pretranslated") in order to have access to all the text to be translated. DVX, etc. do not change the Trados segmentation in any way. With a partially segmented file, DVX will not give you access to the unsegmented text (nor to the numbers or dates which Trados skips).
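As a rough way to check whether a TTX is fully presegmented before opening it in another tool, something like the following Python sketch could be used. It rests on an assumption about the file layout: that translatable content sits under a `Raw` element, that each segment is wrapped in a `Tu` element, and that inline markup is held in `ut` elements. The function name `unsegmented_text` is hypothetical, and numbers or dates that Trados skipped would be reported as unsegmented text too.

```python
import xml.etree.ElementTree as ET


def unsegmented_text(ttx_xml):
    """Return non-blank text runs under <Raw> that are not inside any <Tu> element."""
    root = ET.fromstring(ttx_xml)
    raw = root.find(".//Raw")  # assumed container for the document content
    found = []

    def walk(elem):
        # Text directly inside this element, before its first child.
        if elem.text and elem.text.strip():
            found.append(elem.text.strip())
        for child in elem:
            # Skip segments (<Tu>) and tag content (<ut>); descend into anything else.
            if child.tag not in ("Tu", "ut"):
                walk(child)
            # Text after a child still belongs to the current, unsegmented element.
            if child.tail and child.tail.strip():
                found.append(child.tail.strip())

    if raw is not None:
        walk(raw)
    return found
```

An empty result suggests the file is fully segmented; anything returned is text that a tool relying on the existing Trados segmentation would presumably not show you.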

BTW, Samuel, your comment in the other thread about a TTX being presegmented by definition or some such thing is off the mark. It is marked up as XML, but it is no more "segmented" than an ordinary text in MS Word is. Segmentation happens when, as Pablo said, certain rules are applied and the text is broken up into logical chunks or units for translation. Depending on the CAT we use, we have considerable control over what goes into these chunks.

I think the reason that the documentation you consulted talks mostly about pretranslation is that it is usually assumed that one has an existing Trados TM to be applied. It makes more sense to use the TM content in Trados first (as opposed to exporting the TM to DV or MemoQ and doing the pretranslation there), because the matching will be slightly better. However, for cases where the customer just wants an uncleaned file and there is no TM of value, there really is no "pretranslation" involved - the preparatory step is pure presegmentation.


Daniel García
English to Spanish
+ ...
Can you segment without target segments? Oct 25, 2008


4. Segment all text, but leave the segments untranslated/empty.


I understand the concepts in the same way as Pablo and Kevin so I will not add to this.

I am curious, however, about one of your definitions of segmentation. Could you elaborate on it?

I can't see how you can segment a TTX file without inserting target segments (be it pretranslated or copies of the English).

For me, segmentation implies always inserting a target segment (100% match, fuzzy match or identical to source) but it's clear that you mean something else.

Thanks!

Daniel

PS. Edited to add. Most likely I don't understand because I use mainly SDL Trados. With other tools you can segment without inserting a translation but with Trados you can't, as far as I know.

[Edited at 2008-10-25 08:02]

[Edited at 2008-10-25 08:03]



Rodolfo Raya  Identity Verified
Local time: 08:35
English to Spanish
Segmenting TTX files Oct 25, 2008

Kevin Lossner wrote:

However, if you are doing a TTX with MemoQ, DVX or other tools, as far as I know, with the current state of the art you will always need a fully presegmented file (some of these segments may be "pretranslated") in order to have access to all the text to be translated. DVX, etc. do not change the Trados segmentation in any way. With a partially segmented file, DVX will not give you access to the unsegmented text (nor to the numbers or dates which Trados skips).


Some tools, like DejaVu or MemoQ, require that you process the TTX file with Trados and add segmentation to it.

Other tools, like Swordfish, are able to segment a TTX file without requiring Trados and give you access to text that has not been segmented with TagEditor/WorkBench.

Regards,
Rodolfo



Rodolfo Raya  Identity Verified
Local time: 08:35
English to Spanish
Segmenting with empty target Oct 25, 2008

dgmaga wrote:


4. Segment all text, but leave the segments untranslated/empty.


I can't see how you can segment a TTX file without inserting target segments (be it pretranslated or copies of the English).

For me, segmentation implies always inserting a target segment (100% match, fuzzy match or identical to source) but it's clear that you mean something else.


Some tools let you segment a TTX file and leave the target untranslated/empty. Leaving target empty can be an important advantage.

If you populate target with a copy of source text, it may not be easy to tell which segments need translation and which ones are already translated.

If you insert fuzzy matches in target, it may not be obvious that you need to fix the segment. In this case, having target empty and fuzzy matches in another window or panel is a better option.

If you have a 100% match or in-context exact match, then it is a good idea to put that match in target. But only if you know in advance that the translation is correct.
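The policy described above could be written down roughly as follows. This is only a sketch of the reasoning, not the behaviour of any particular tool; the function name `choose_target` and the `trusted` flag (meaning "we know in advance that the TM translation is correct") are assumptions introduced for illustration.

```python
def choose_target(match_percent, tm_target, trusted):
    """Decide what goes into the target and what is merely offered as a suggestion.

    Returns a (target_text, suggestion) tuple.
    """
    if tm_target is not None and match_percent >= 100 and trusted:
        # Exact match known to be correct: safe to place in the target.
        return tm_target, None
    if tm_target is not None and match_percent > 0:
        # Fuzzy or unverified match: keep the target empty, show the match elsewhere.
        return "", tm_target
    # No match: leave the target empty rather than copying the source.
    return "", None
```

With this approach an empty target always means "still needs work", which is the advantage of not copying source text into the target.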

Regards,
Rodolfo




KSL Berlin  Identity Verified
Portugal
Local time: 12:35
Member (2003)
German to English
+ ...
The content of target segments Oct 25, 2008

dgmaga wrote:
For me, segmentation implies always inserting a target segment (100% match, fuzzy match or identical to source) but it's clear that you mean something else.


As far as I recall, Trados can indeed segment a text and leave the target segment empty, but there are problems trying to use such a document in third-party software (at least the software I use). Thus the presegmentation that you & I practice always involves populating the target with TM content or the source content.

Rodolfo - thanks for the tip about Swordfish. I had heard of this elsewhere a while back but forgotten the details. Given the incompatibilities between Trados versions I must admit that the idea of using a third-party tool to presegment a TTX makes me a bit nervous. Then there are the cases where there is an INI file to apply. Thus I prefer to use the appropriate versions of SDL Trados (for which I own a few licenses, but the demo version works just as well) to do this work. Maybe this is excessive caution on my part, but without careful compatibility testing or access to plausible results of such testing, I'm not going to stick my neck out on this one. However, I do plan to look into this more when I can find the time.


Daniel García
English to Spanish
+ ...
I will have a look Oct 25, 2008

Kevin Lossner wrote:

dgmaga wrote:
For me, segmentation implies always inserting a target segment (100% match, fuzzy match or identical to source) but it's clear that you mean something else.


As far as I recall, Trados can indeed segment a text and leave the target segment empty, but there are problems trying to use such a document in third-party software (at least the software I use). Thus the presegmentation that you & I practice always involves populating the target with TM content or the source content.


Aha, I will have a look, there might be some setting somewhere.

As I use it, I do a pretranslation and there you have two options:

a) Have only the translated sentences segmented (untranslated sentences remain unsegmented).

b) Use the "Segment unknown sentences" option, and then untranslated sentences are segmented, with the source text inserted in the target.

I had even thought that you might have problems if you had empty segments in a TTX file.


Kevin Lossner wrote:
Rodolfo - thanks for the tip about Swordfish. I had heard of this elsewhere a while back but forgotten the details. Given the incompatibilities between Trados versions I must admit that the idea of using a third-party tool to presegment a TTX makes me a bit nervous.


I could not agree more. I would not use a third-party tool unless I could fully test the whole procedure, including making sure that the final TTX file can be further processed with TagEditor on the client's side.

Daniel

[Edited at 2008-10-25 11:04]



Vito Smolej
Germany
Local time: 13:35
Member (2004)
English to Slovenian
+ ...
Presegmentation ... Oct 25, 2008

[what does] it mean to (a) pretranslate and (b) presegment?

I want to make it (sound) simple:
i) pretranslate means complementing the source segments with target text from an existing TM or TMs (according to conditions set by the user, of course).
ii) presegment is baloney - or a misnomer: segmentation, i.e. separating the input stream into cohesive units, is not so difficult that it could not be done just in time - unless the software really sucks. So the term "presegment" must mean something else to whoever has been floating it. And I don't think I need that something else.



tectranslate ITS GmbH
Local time: 13:35
German
+ ...
Presegmentation vs. Pretranslation Jan 19, 2009

Kevin Lossner wrote:
...You can have presegmentation WITH pretranslation as well. You can have presegmentation without it, however...


I see others have mentioned this as well, but the latter "can" should be a "cannot".

All the best,

Benjamin



Klaus Herrmann  Identity Verified
Germany
Local time: 13:35
Member (2002)
English to German
+ ...
Presegmenting without Trados? Jan 24, 2009

BIG thanks for Swordfish, Rodolfo! It saved my behind on an Idiom project that had been exported to TTX, where Trados (versions 6.5, 7 and 8) messed up the tags for good.

Kevin Lossner wrote:
Rodolfo - thanks for the tip about Swordfish. I had heard of this elsewhere a while back but forgotten the details. Given the incompatibilities between Trados versions I must admit that the idea of using a third-party tool to presegment a TTX makes me a bit nervous.


Yes, unless you need to get the file back into Idiom...

Are there any other tools for presegmenting TTXs without Trados?



