Experimenting with DeepL: trying to improve inter-segment term consistency
Thread poster: Hans Lenting

Hans Lenting  Identity Verified
Netherlands
Member (2006)
German to Dutch
Mar 29

By default, DeepL will segment the input that you present to the system:

mvm9fr6fou73w5xutnwb.png

So, if you send these 4 related segments, they will be treated as 'not related', which can lead to inconsistent translation of terms (here: Gesamtanlage):

Maßzeichnung der Gesamtanlage
Gesamtanlage starten
Gesamtanlage nach Sicherheitsstopp starten
Beschickung der Gesamtanlage


juynqwkinffevslhotkz.png

This result matches what you get when you translate the 4 segments individually:

nyqblpm4ai8bt7wk1mov.png
cdmi6mwridcjasahiwup.png
n8uiwqlyal1d7jezhotz.png
y8gkxqfnjq0ly81vv2xc.png

However, when you force DeepL to interpret the 4 segments as 1 segment, thus making them related, you'll get a better result ('Gesamtanlage' is translated consistently):

wohkju1ny13iurt4msq3.png

(On a side note: This approach has the disadvantage that the chapter titles in the example are interpreted as instructions.)

Of course, the peepl at DeepL are working hard on improving inter-segment consistency (which is considered to be the Holy Grail of MT).

In the meantime, we could try having CafeTran (CT) send several segments to DeepL at the same time and then re-segment the translated result before it goes into the TM.
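To make the idea concrete, here is a minimal Python sketch of "send several segments together, then re-segment". It is not how CafeTran or DeepL actually work: `translate` is a hypothetical stand-in for whatever MT call you use, and the separator is an assumption (you would have to test which marker DeepL leaves intact without splitting the input into separate segments).

```python
from typing import Callable, List


def translate_as_block(
    segments: List[str],
    translate: Callable[[str], str],  # hypothetical: any function that translates one string
    separator: str = " ; ",           # assumption: a marker the MT engine returns unchanged
) -> List[str]:
    """Join segments, translate them in one call, and split the result again."""
    joined = separator.join(segments)
    translated = translate(joined)
    # Split on the bare marker and trim the whitespace around each part.
    parts = [part.strip() for part in translated.split(separator.strip())]
    if len(parts) != len(segments):
        # The marker was lost, altered, or already present in a segment.
        raise ValueError(
            f"expected {len(segments)} segments back, got {len(parts)}"
        )
    return parts
```

If the count does not match, the safest fallback is probably to translate the segments one by one, as before.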


 

Hans Lenting  Identity Verified
Netherlands
Member (2006)
German to Dutch
TOPIC STARTER
English as the inter-language Mar 29

As long as DeepL makes these kinds of 'mistakes', we human translators have nothing to fear:

knof8afz0cwuuueetwmp.png


 

Hans Lenting  Identity Verified
Netherlands
Member (2006)
German to Dutch
TOPIC STARTER
Chopping up your source text in chunks of 5000 characters Apr 5

Here are some nice macros to chop up your source text into chunks of 5000 characters, which is the maximum number of characters that you can insert in DeepL web:

https://forum.keyboardmaestro.com/t/would-like-to-create-a-macro-chop-up-text-in-approx-400-word-segments/9946/7
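For those who don't use Keyboard Maestro, the same chopping can be sketched in Python. This is not the macro from the link, just an illustration under my own assumptions: `chunk_text` is a made-up name, and the sketch packs whole lines into chunks of at most 5000 characters, cutting a line hard only when that single line is longer than the limit.

```python
from typing import List


def chunk_text(text: str, limit: int = 5000) -> List[str]:
    """Greedily pack whole lines into chunks of at most `limit` characters."""
    chunks: List[str] = []
    current = ""
    for line in text.splitlines(keepends=True):
        # A single line longer than the limit has to be cut hard.
        while len(line) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(line[:limit])
            line = line[limit:]
        # Start a new chunk when this line no longer fits.
        if len(current) + len(line) > limit:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks
```

Joining the chunks again gives back the original text, so nothing is lost between the pieces.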

One idea that comes to mind is to save these chunks of 5000 characters and the DeepL-generated translations as bitexts and import these into CafeTran Espresso 2018.

I'd have to investigate whether this approach would improve terminological consistency.


 

Hans Lenting  Identity Verified
Netherlands
Member (2006)
German to Dutch
TOPIC STARTER
Example Apr 5

A Terminotix bitext looks like this:

<?xml version="1.0" encoding="utf-8"?>
<bitext version="1.2">
<meta>
<langsrc>ENG</langsrc>
<langtgt>DEU</langtgt>
</meta>
<segments>
<seg id="1">
<src>Source sentence one. Source sentence two. Source sentence three.</src>
<tgt>Target sentence one. Target sentence two. Target sentence three.</tgt>
</seg>
</segments>
</bitext>
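
A file in this shape could be generated from (source chunk, translated chunk) pairs with a few lines of Python. This is only a sketch based on the sample above, not on any Terminotix specification: `build_bitext` is a made-up name, and the element names, the version attribute and the language codes are simply copied from the example.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Tuple


def build_bitext(
    pairs: Iterable[Tuple[str, str]],
    src_lang: str = "ENG",  # codes as in the sample; adjust to your language pair
    tgt_lang: str = "DEU",
) -> bytes:
    """Serialize (source, target) pairs in the layout of the sample bitext."""
    root = ET.Element("bitext", version="1.2")
    meta = ET.SubElement(root, "meta")
    ET.SubElement(meta, "langsrc").text = src_lang
    ET.SubElement(meta, "langtgt").text = tgt_lang
    segments = ET.SubElement(root, "segments")
    for number, (src, tgt) in enumerate(pairs, start=1):
        seg = ET.SubElement(segments, "seg", id=str(number))
        ET.SubElement(seg, "src").text = src
        ET.SubElement(seg, "tgt").text = tgt
    # Characters such as & and < are escaped by ElementTree.
    # The xml_declaration argument needs Python 3.8 or later.
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)
```

The returned bytes can be written to a file in binary mode; whether CafeTran accepts the result exactly like this would still have to be tested.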

Import of a sample bitext:

olkt91xh1dchbn17llma.png
mf8fxuw96rw4bmn288sn.png

Of course, it would be nice to have the segmentation adjusted automatically.
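
A very naive way to adjust the segmentation automatically could look like the Python sketch below. It is an assumption-laden illustration, not a CafeTran feature: `resegment` is a made-up name, sentences are split at whitespace after '.', '!' or '?', and the pairs are only broken up when source and target yield the same number of sentences. Abbreviations such as 'z. B.' will fool this splitter.

```python
import re
from typing import List, Tuple

# Split at whitespace that follows sentence-final punctuation.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def resegment(src: str, tgt: str) -> List[Tuple[str, str]]:
    """Pair sentences one-to-one when both sides split into the same count."""
    src_parts = _SENTENCE_END.split(src.strip())
    tgt_parts = _SENTENCE_END.split(tgt.strip())
    if len(src_parts) == len(tgt_parts):
        return list(zip(src_parts, tgt_parts))
    # Counts differ: keep the original pair rather than guess an alignment.
    return [(src, tgt)]
```

Applied to the sample segment above, this gives three sentence pairs; when DeepL merges or splits sentences, the pair stays as it was and needs manual alignment.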

[Edited at 2018-04-05 19:05 GMT]


 

