How to use two different translations for the same word?
Thread poster: Harklas
Harklas
Local time: 16:19
Jun 14, 2010

Hi!

I have OmegaT-2.0.5_2

It seems I can't give the same English word two different translations in the target language.

A hypothetical example of the problem at hand:
There are two instances of the word "jump", one as a noun and one as a verb. I translate the first "jump" as "hopp", and OmegaT automatically translates the second as "hopp" too, which is wrong, because the verb "jump" is "hoppa". So I change the second "jump" to "hoppa", and then OmegaT changes the first "jump" to "hoppa" as well, although it should be "hopp".

It's like I have to choose between:
Jump = Hopp
Jump = Hopp

and

Jump = Hoppa
Jump = Hoppa

but what I want is:

Jump = Hopp
Jump = Hoppa

How do I do that?

Any quick input would be of great help.



Susan Welsh  Identity Verified
United States
Local time: 10:19
Member (2008)
Russian to English
+ ...
I guess these are one-word segments? Jun 14, 2010

If not one-word segments, I don't see why it would be a problem.
If they are one-word segments, then OmegaT (unfortunately) translates every occurrence of the word the same way. I raised the same issue on the OmegaT Yahoo users forum recently, and was told that you have to alter the source document in some way to flag the difference in meaning, for example "#jump" vs. "jump" without the number sign.

It's kind of a pain, but there you have it.

Susan



Samuel Murray  Identity Verified
Netherlands
Local time: 16:19
Member (2006)
English to Afrikaans
+ ...
@Harklas Jun 14, 2010

Harklas wrote:
It seems I can't give the same English word two different translations in the target language.


I believe the OmT developers are aware of this behaviour, but they don't regard it as a problem, and consequently they are not in a hurry to produce any type of built-in solution for it (for after all, a solution requires a problem). One solution might be to have a function in OmegaT that extracts all repeating segments to a separate file, so that a user can check them beforehand, or a function that allows the user to press a shortcut to jump to repeating segments (translated or not), so that he can check them beforehand or afterwards.
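To illustrate the "extract repeating segments" idea outside OmegaT, here is a minimal sketch. It assumes the source text is already available as a plain list of segments; the function name `find_repeating_segments` and the sample data are made up for the example and have nothing to do with OmegaT's own code.

```python
from collections import Counter


def find_repeating_segments(segments):
    """Return the segments that occur more than once, in order of first appearance."""
    counts = Counter(segments)
    seen = set()
    repeating = []
    for seg in segments:
        # Report each repeated segment only once, the first time it is met.
        if counts[seg] > 1 and seg not in seen:
            seen.add(seg)
            repeating.append(seg)
    return repeating


# Hypothetical sample document, already split into segments.
segments = ["Jump", "Press the button.", "Jump", "Exit"]
print(find_repeating_segments(segments))
```

The resulting list could be written to a separate file and reviewed before translation starts.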

It is as Susan said... you just have to be very, very watchful, and when you see an automatically translated segment approaching, quickly open it and add a marker like ### to it so that you can check it and correct it in the compiled version of the file.

If your one-word segments occur in mid-paragraph, you can merge that segment with the next or previous segment, by adding a segmentation rule. If you encounter this frequently, you can try my little script for it: http://leuce.com/tempfile/omtautoit/segadder.zip


Harklas
Local time: 16:19
TOPIC STARTER
How can this not be a problem? Jun 15, 2010

I was even ashamed to ask the question, thinking it must be a setting somewhere and that I just didn't look around hard enough or RTFM.

Every language is full of words with multiple meanings. SDLX Lite doesn't behave like this, for one thing, and I doubt any other TM software does either. They let you choose for each instance whether you want alternative 1), 2) or 3), etc.

Anyway, thanks for your answers! I'll try to join segments, bundling the problematic one-liners with the previous or subsequent segments to make them unique. I should have thought of that myself, but when your brain isn't in place, it's good that you friendly souls on the Internet are.

Exactly what does your script do, btw?

[Edited at 2010-06-15 11:23 GMT]


Tim Mott
Canada
Local time: 10:19
French to English
frustrating behaviour! Jun 15, 2010

I also find this behaviour frustrating, and even dangerous. It happens often enough that I have a source text with titles split into several one- or two-word paragraphs, and I have to be careful to translate these to avoid 'contaminating' the rest of the document.

To make up an example, suppose we have

CHEF
DE
BUREAU

and

CHEF
DE
POLICE.

These must end up as "Office Manager" and "Chief of Police" ... but there is a risk of getting absurdities like "Chief of Manager" or "Office Police"!

I can't see how the OmegaT team doesn't recognize this behaviour as a flaw. The suggestion of inventing arbitrary segmentation rules isn't feasible for large projects, and in my opinion only highlights the fact that there is a major issue.

It seems to me that it would be very(!) simple to implement a feature whereby you could simply 'lock' a particular segment to isolate it from the rest of the document, and thus avoid propagating the translation both from and to other segments.

Tim


Harklas
Local time: 16:19
TOPIC STARTER
indeed Jun 15, 2010

I'm using SDLX now. It can translate everything automatically for you using old matches if you _want_ it to, but the default is that you have to manually approve each translation from a list of previous matches. Which so far has been the better option in every project I've done.


Didier Briel  Identity Verified
France
Local time: 16:19
Member (2007)
English to French
+ ...
It is a problem, but the solution isn't easy Jun 16, 2010

Samuel Murray wrote:
I believe the OmT developers are aware of this behaviour, but they don't regard it as a problem, and consequently they are not in a hurry to produce any type of built-in solution for it (for after all, a solution requires a problem).

It is considered a serious problem.

Being a serious problem doesn't mean there is an easy solution.

We are (slowly admittedly) working on it.

Didier



Didier Briel  Identity Verified
France
Local time: 16:19
Member (2007)
English to French
+ ...
There is a single segment for duplicates Jun 16, 2010


Tim Mott wrote:
It seems to me that it would be very(!) simple to implement a feature whereby you could simply 'lock' a particular segment to isolate it from the rest of the document, and thus avoid propagating the translation both from and to other segments.

Except that, if the word 'Hello' appears several times in a document, there will be a single 'Hello' segment in memory. ('Locking' it would not thus achieve anything.)

That's not a choice of the current developers; that's how it was designed initially.

So, the solution is not 'very' simple.
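
As a toy model of the point above (this is not OmegaT's actual code): if the memory is a map keyed by source text alone, every occurrence of a source segment shares one entry, so the last translation entered wins everywhere. The class `ToyMemory` and its methods are hypothetical names used only for this sketch.

```python
class ToyMemory:
    """Toy translation memory keyed only by source text (an assumption for illustration)."""

    def __init__(self):
        self._entries = {}

    def translate(self, source, target):
        # Storing by source text overwrites any earlier translation of the same text.
        self._entries[source] = target

    def lookup(self, source):
        return self._entries.get(source)


memory = ToyMemory()
document = ["Jump", "Jump"]             # noun first, verb second
memory.translate(document[0], "hopp")   # translate the noun
memory.translate(document[1], "hoppa")  # translating the verb replaces the noun's entry

rendered = [memory.lookup(seg) for seg in document]
print(rendered)  # both occurrences now carry the same target text
```

In a model like this there is no per-occurrence entry to 'lock'; telling the two occurrences apart would need a richer key, for example source text plus position or context.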

Didier



Samuel Murray  Identity Verified
Netherlands
Local time: 16:19
Member (2006)
English to Afrikaans
+ ...
What my script does Jun 16, 2010

Harklas wrote:
Exactly what does your script do, btw?


My script makes it easier to add exceptions to the segmentation rules. Basically, if you press the script's shortcut key, you're asked for some text that you want to be an exception, and then my script adds it to the segmentation rules for you. The script is also useful for adding abbreviations to the segmentation rules. My script would be far less necessary if adding segmentation rules was less user-unfriendly in OmegaT, but it is, and so this script is useful. The script has a readme file in the zip file explaining everything.



Samuel Murray  Identity Verified
Netherlands
Local time: 16:19
Member (2006)
English to Afrikaans
+ ...
@Didier Jun 16, 2010

Didier Briel wrote:
Samuel Murray wrote:
I believe the OmT developers are aware of this behaviour, but they don't regard it as a problem, and consequently they are not in a hurry to produce any type of built-in solution for it (for after all, a solution requires a problem).

It is considered a serious problem. Being a serious problem doesn't mean there is an easy solution.


This serious problem has been a thorn in the side of OmegaT users for quite a few years now. I've checked the User Manual briefly but I find no mention of it, so it would seem that new users are expected to just discover it by themselves. Perhaps a section in the User Manual called "Limitations of OmegaT" or "Shortcomings of OmegaT" or "Known problems/bugs in OmegaT" would be useful.

The fact that there is no easy solution does not mean that a temporary solution should not be considered:

1. I have lobbied for a text extraction feature for many years now, and such a feature could then be used by translators to avoid (not permanently solve, but avoid) the problem of unique repeating segments.
2. If it is possible for OmegaT to determine (from the source documents) which segments are repeating, it could add those segments to the project_save.tmx file in advance, with "[repeating]" in front of the target text, so that users can at least be made aware of those segments, so that they can decide for themselves how to solve it.
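
Suggestion 2 could be sketched roughly like this. It is only an illustration under stated assumptions: the segments are given as a plain list, the output uses the generic TMX tu/tuv/seg layout with a TMX 1.1 style `lang` attribute, and the header and any other details that OmegaT would need in project_save.tmx are left out. The function name `build_repeating_tus`, the language codes and the choice to put the source text after the marker are all hypothetical.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def build_repeating_tus(segments, src_lang="EN", tgt_lang="SV", marker="[repeating]"):
    """Build a <body> holding one <tu> per source segment that occurs more than once."""
    counts = Counter(segments)
    body = ET.Element("body")
    for seg, n in counts.items():
        if n < 2:
            continue  # unique segments need no flag
        tu = ET.SubElement(body, "tu")
        src = ET.SubElement(ET.SubElement(tu, "tuv", {"lang": src_lang}), "seg")
        src.text = seg
        tgt = ET.SubElement(ET.SubElement(tu, "tuv", {"lang": tgt_lang}), "seg")
        # Assumption: the flagged target is the marker followed by the untranslated source.
        tgt.text = f"{marker} {seg}"
    return body


body = build_repeating_tus(["Jump", "Exit", "Jump"])
print(ET.tostring(body, encoding="unicode"))
```

A translator opening such a pre-filled segment would see the marker immediately and know that the segment occurs elsewhere.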

The biggest problem here IMO is not the fact that OmegaT can't translate non-unique segments uniquely, but that there is no way for the user to safely be aware of such segments.



