Pretranslate with Machine Translation: not all segments are translated
Olaf Reibedanz Argentina Local time: 01:09 Member (2003) English to German + ...
Oct 21, 2011
Hi everybody,
I am having the following problem when using the "Pretranslate" function in DVX 2: Only some segments are translated, the others remain empty. For example, the first 3 segments are translated, the next 4 are empty, the following 2 are translated, the next 5 are empty again, etc. The pattern seems completely random. And the result is different each time I repeat the same procedure from scratch, with the same file.
Has anybody ever experienced this problem? Could it be related to the fact that my current internet connection is a bit unstable? Or do you have any other idea what might be causing the problem?
Many thanks in advance!
Kind regards,
Olaf
Selcuk Akyuz Turkey Local time: 07:09 Member (2006) English to Turkish + ...
Google Translate API
Oct 21, 2011
Hi Olaf,
I don't use Google Translate for pretranslation but perhaps it is something related to the new restrictions of Google Translate. But there is always another solution, export to External View, translate with Google and then reimport. It may require replacing line breaks with paragraph marks, and converting to table format in Word.
Victor Dewsbery Germany Local time: 06:09 German to English + ...
Google turning the screw
Oct 21, 2011
This has been reported on the beta list, too. Apparently, it results from access restrictions imposed by Google. Although the wind-down of the "Google Translate API" was announced for December, it seems that Google has already started to restrict the number of "calls" it will take from DVX2 in any period of time, so when the quota has been used up, it refuses to respond.
I hardly ever use GT for confidentiality reasons, so it wasn't me "eating up" your quota!
I know that Atril is weighing up the possible ways forward. They may find your report interesting - if only to work out how long the defined time periods for the GT quota are. From your account, it seems almost as if GT imposes a quota of a certain number of calls per minute, cutting you off when the quota is up but letting you in again and counting again when the next minute starts.
If that is the case and you want the whole file GT'd, you may be able to outsmart the system roughly like this:
1. Pretranslate with GT activated, and let it do as many segments as it is willing to do.
2. In the row selector (in the middle above the sentence grid) select "SQL statement" and use the function "Build expression" to select all segments that are not machine translated.
3. Pretranslate these segments with GT activated, but taking care to ensure that you have ticked "Limit to current view". Again, it will probably do some segments and leave some out.
4. Repeat steps 2 and 3 until you have got the lot done.
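The repeat-until-done loop in steps 1 to 4 can be sketched in Python as follows. This is only an illustration of the logic, not anything DVX2 exposes: the segment dictionaries, the `translate` callable and the fixed per-pass `quota` are all hypothetical, and the real quota window used by Google is not known from this thread.

```python
def pretranslate_pass(segments, translate, quota):
    """Translate empty segments until the (hypothetical) per-pass quota is spent.

    Returns the number of segments translated in this pass.
    """
    used = 0
    for seg in segments:
        if seg["target"]:
            continue  # step 2: skip rows that are already machine translated
        if used >= quota:
            break  # the service stops responding once the quota is used up
        seg["target"] = translate(seg["source"])
        used += 1
    return used


def pretranslate_until_done(segments, translate, quota, max_passes=1000):
    """Repeat passes (steps 2 and 3) until no empty targets remain."""
    passes = 0
    while passes < max_passes and any(not s["target"] for s in segments):
        if pretranslate_pass(segments, translate, quota) == 0:
            break  # no progress at all, so stop instead of looping forever
        passes += 1
    return passes


# str.upper stands in for the machine translation call.
segments = [{"source": f"segment {i}", "target": ""} for i in range(25)]
passes = pretranslate_until_done(segments, str.upper, quota=10)
print(passes)
```

In practice each pass would also have to wait for the next quota window to open, which is why the number of passes grows quickly with the size of the file.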
Alternatively, of course, you could leave the initial pass as it is or pretranslate without GT, and only call GT for individual rows in which you want a second opinion (with CTRL-G).
[Edited at 2011-10-21 14:08 GMT]
Grzegorz Gryc Poland Local time: 06:09 French to Polish + ...
HTML EV
Oct 21, 2011
Selcuk Akyuz wrote:
I don't use Google Translate for pretranslation
I often do.
My texts are often EU related now and don't require confidentiality.
but perhaps it is something related to the new restrictions of Google Translate.
Yep.
In fact, the GT interface stopped working in batch mode starting from Monday, AFAIR.
10 TUs per run is the maximum.
But there is always another solution, export to External View, translate with Google and then reimport. It may require replacing line breaks with paragraph marks, and converting to table format in Word.
Export EV as HTML.
It works like a charm and only very basic intelligence is required...
Cheers
GG
[Edited at 2011-10-21 15:53 GMT]
Grzegorz Gryc Poland Local time: 06:09 French to Polish + ...
No sound workaround inside DVX2
Oct 21, 2011
Victor Dewsbery wrote:
I know that Atril is weighing up the possible ways forward. (...)
Hopefully.
If that is the case and you want the whole file GT'd, you may be able to outsmart the system roughly like this:
1. Pretranslate with GT activated, and let it do as many segments as it is willing to do.
2. In the row selector (in the middle above the sentence grid) select "SQL statement" and use the function "Build expression" to select all segments that are not machine translated.
3. Pretranslate these segments with GT activated, but taking care to ensure that you have ticked "Limit to current view". Again, it will probably do some segments and leave some out.
4. Repeat steps 2 and 3 until you have got the lot done.
For a 100,000-character job, hell will freeze.
It's unworkable.
Cheers
GG
Victor Dewsbery Germany Local time: 06:09 German to English + ...
Business case?
Oct 21, 2011
Grzegorz Gryc wrote:
For a 100000 chars job, hell will freeze. It's unworkable.
Agreed. On the other hand, I don't quite see the business case for a professional translator with a powerful CAT tool to send a 100,000 character job to GT lock stock and barrel.
Grzegorz Gryc Poland Local time: 06:09 French to Polish + ...
The subject matters...
Oct 21, 2011
Victor Dewsbery wrote:
Grzegorz Gryc wrote:
For a 100000 chars job, hell will freeze. It's unworkable.
Agreed. On the other hand, I don't quite see the business case for a professional translator with a powerful CAT tool to send a 100,000 character job to GT lock stock and barrel.
I insist it's heavily subject related.
The EU corpus is very, very big and GT often gives better results than AA, DeepMiner et Cie even if they're based on a huuuge EU corpus.
And GT is faster in the interactive work.
For every new segment, I save at least 2-5 seconds (my set of DBs is really huge and the DVX2 on-the-fly operations take too much time).
But one thing is sure. GT is not a brain replacement.
As it's not configurable, some results are obviously unusable, but a sound CAT tool like DVX2 makes it easy to insert AutoWrite/terminology portions and to correct the machine.
Of course, if one knows how to correct the machine...
So, almost 8000 words (no reps) today.
On the other hand, today I had a small urgent technical job (300 words) with no serious GT hits at all.
A 10,000-character job like that processed with GT would be a nightmare.
In this case, my own TMs and TBs would be incontestably better.
PS.
I never use GT for confidential jobs.
I think this approach is evident for everybody...
Cheers
GG
Olaf Reibedanz Argentina Local time: 01:09 Member (2003) English to German + ...
TOPIC STARTER
More questions
Nov 1, 2011
Thanks everybody for your input!
you may be able to outsmart the system roughly like this:
1. Pretranslate with GT activated, and let it do as many segments as it is willing to do.
2. In the row selector (in the middle above the sentence grid) select "SQL statement" and use the function "Build expression" to select all segments that are not machine translated.
3. Pretranslate these segments with GT activated, but taking care to ensure that you have ticked "Limit to current view". Again, it will probably do some segments and leave some out.
4. Repeat steps 2 and 3 until you have got the lot done.
Unfortunately, this method doesn't work. Even after repeating the procedure a dozen times, most of the segments are still left untranslated.
But there is always another solution, export to External View, translate with Google and then reimport. It may require replacing line breaks with paragraph marks, and converting to table format in Word.
Can you explain this in more detail? What I have tried so far is this:
1) Copy all source segments to target (F5)
2) Export > External view (as an RTF table)
3) Highlight all the target text and process it with GT4T.
But this method has two shortcomings:
- Usually the result is a segment mismatch between source and target: The translation of segment 1 ends up in segment 2; the translation of segment 2 ends up in segment 3; the translation of segment 3 ends up in segment 4; etc. Do you know what I mean, and do you know why this happens?
- While GT4T is working, I cannot do anything else on the computer because the machine translated text is always inserted in the place where my cursor is located at the moment GT4T finishes processing the text.
Of course, these two shortcomings can be dealt with (by just sitting and waiting till GT4T has finished its work, and by then fixing the misaligned segments manually), but they are still a bit annoying.
Export EV as HTML. It works as a charm and a very basic intelligence is required...
So if I understand you correctly, if I want to process the target column with GT4T in an external view file, HTML is better than RTF? Can you explain to me why?
Cheers,
Olaf
[Edited at 2011-11-01 17:50 GMT]
[Edited at 2011-11-01 17:51 GMT]
Grzegorz Gryc Poland Local time: 06:09 French to Polish + ...
GT Toolkit
Nov 1, 2011
Olaf Reibedanz wrote:
Export EV as HTML. It works as a charm and a very basic intelligence is required...
So if I understand you correctly, if I want to process the target column with GT4T
I probably wrote it too fast and missed a point (at about the same time, I was discussing a very similar question on dejavu-l...).
You should use Google Translator Toolkit, not GT4T.
You should also process the source and copy-paste the results as target in a copy of your initial EV file.
Then reimport.
The GT results will be shown as unpainted rows, so you should not pretranslate the project after the GT data results are imported.
BTW.
Be careful.
Don't split/join segments before you reimport the EV.
You should also check the codes immediately after the import: GTT screws up the spaces around the codes (additional spaces are inserted) and may damage some codes.
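A code check of this kind can be roughed out in Python. The sketch below assumes that the embedded codes appear in the exported text as numbers in curly braces such as `{1}`; that notation, and the function names, are assumptions for illustration only. It flags codes that went missing or were damaged, and codes whose surrounding spaces differ between source and target.

```python
import re

# Assumed code notation: a number in curly braces, e.g. {1}
CODE = re.compile(r"\{\d+\}")


def missing_codes(source, target):
    """Return source codes that no longer appear intact in the target."""
    target_codes = CODE.findall(target)
    return [code for code in CODE.findall(source) if code not in target_codes]


def spacing_changed(source, target):
    """Return codes whose neighbouring spaces differ between source and target."""
    changed = []
    for code in CODE.findall(source):
        s, t = source.find(code), target.find(code)
        if t == -1:
            continue  # reported by missing_codes instead
        src = (source[max(s - 1, 0):s] == " ", source[s + len(code):s + len(code) + 1] == " ")
        tgt = (target[max(t - 1, 0):t] == " ", target[t + len(code):t + len(code) + 1] == " ")
        if src != tgt:
            changed.append(code)
    return changed


source = "Press {1}OK{2} now"
print(spacing_changed(source, "Drücken Sie {1} OK {2} jetzt"))
print(missing_codes(source, "Drücken Sie {1}OK{ 2} jetzt"))
```

A spacing difference is only a hint for manual review, since word order changes between languages can legitimately move a space.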
in an external view file, HTML is better than RTF? Can you explain me why?
HTML is smaller and easier to handle in GTT.
GTT has a 1 MB limit for a single file.
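For exports that exceed the limit, the rows can be grouped into several smaller files before upload. A minimal Python sketch, assuming the rows are already available as strings and taking 1 MB as 1,000,000 bytes (the exact figure GTT applies is an assumption here, as is the function name):

```python
def chunk_rows(rows, max_bytes=1_000_000):
    """Group rows into chunks whose UTF-8 size stays within max_bytes.

    A single row larger than max_bytes still gets a chunk of its own.
    """
    chunks, current, size = [], [], 0
    for row in rows:
        row_size = len(row.encode("utf-8"))
        if current and size + row_size > max_bytes:
            chunks.append(current)  # close the chunk before it overflows
            current, size = [], 0
        current.append(row)
        size += row_size
    if current:
        chunks.append(current)
    return chunks


rows = ["a" * 400 for _ in range(5)]
chunks = chunk_rows(rows, max_bytes=1000)
print([len(chunk) for chunk in chunks])
```

The row order is preserved, so the translated chunks can be concatenated again before the results are pasted back into the target column.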
Cheers
GG
Google Translate will become a paid service in December. Right now they are preparing for that by instituting a limit on the number of queries you can make in a certain amount of time. Since Déjà Vu X2 uses Atril’s own Google Translate API key to send the queries, the limit is shared among all the users of Déjà Vu X2, which means that it is reached very quickly. When the limit is reached, Google Translate stops responding for a time.
In the next build we will allow users to configure Déjà Vu X2 to use their own GT API keys, which means that individual users will need to open an account with GT to use it. After December, these accounts will no longer be free.
In early 2012 we will add support for other MT services apart from Google.
Dallas Cao China Local time: 12:09 Member (2007) English to Chinese + ...
Enable preview
Jan 11
Hi Olaf,
If you don't want GT4T to automatically insert the translation, you can go to the setup screen -> Machine Translation -> preview options and choose show preview, show alternative translations, or show both Bing and Google translation.
GT4T has a new feature that combines a user-defined glossary with machine translation so the terms are always translated correctly. Check it out.
Dallas