Experimenting with DeepL: trying to improve inter-segment term consistency
Thread poster: Hans Lenting

Hans Lenting  Identity Verified
Netherlands
Member (2006)
German to Dutch
+ ...
Mar 29

By default, DeepL will segment the input that you present to the system:

[Screenshot: split]

So if you send these 4 related segments, DeepL treats them as unrelated, which can lead to inconsistent translation of terms (here: Gesamtanlage):

Maßzeichnung der Gesamtanlage
Gesamtanlage starten
Gesamtanlage nach Sicherheitsstopp starten
Beschickung der Gesamtanlage


[Screenshot: DeepL result for the 4 segments]

This result matches the results of the 4 individual segments:

[Screenshots: 2, 3, 4, Screen Shot 2018-03-29 at 10.06.20]

However, when you force DeepL to interpret the 4 segments as 1 segment, thus making them related, you'll get a better result ('Gesamtanlage' is translated consistently):

[Screenshot: Screen Shot 2018-03-29 at 10.12.51]

(On a side note: This approach has the disadvantage that the chapter titles in the example are interpreted as instructions.)

Of course, the peepl at DeepL are working hard on improving inter-segment consistency (which is considered to be the Holy Grail of MT).

In the meantime, we could try having CafeTran (CT) send several segments to DeepL at once and then re-segment the translated result before it goes into the TM.
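
A minimal sketch of that batching idea in Python. Everything here is an assumption for illustration: `translate` is a placeholder for whatever sends text to the MT engine (it is not a DeepL or CafeTran call), and `joiner` stands for whatever separator makes the engine treat the block as one unit while still surviving translation, which would have to be found by experiment.

```python
from typing import Callable, List


def translate_batch(segments: List[str],
                    translate: Callable[[str], str],
                    joiner: str = "\n") -> List[str]:
    """Send several segments as one block, then re-segment the result.

    `translate` is a hypothetical callable wrapping the MT engine.
    `joiner` is assumed to come back unchanged in the translation.
    """
    block = joiner.join(segments)
    translated = translate(block)
    parts = [part.strip() for part in translated.split(joiner)]
    if len(parts) != len(segments):
        # The joiner did not survive translation, so the block cannot be
        # re-segmented reliably: fall back to one request per segment.
        return [translate(segment) for segment in segments]
    return parts
```

The count check is the important part: if the number of translated pieces does not match the number of source segments, the pairs must not be written to the TM as they are.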



Hans Lenting  Identity Verified
Netherlands
Member (2006)
German to Dutch
+ ...
TOPIC STARTER
English as the inter-language Mar 29

As long as DeepL makes these kinds of 'mistakes', we human translators have nothing to fear:

[Screenshot: Screen Shot 2018-03-29 at 11.07.41]



Hans Lenting  Identity Verified
Netherlands
Member (2006)
German to Dutch
+ ...
TOPIC STARTER
Chopping up your source text in chunks of 5000 characters Apr 5

Here are some nice macros to chop up your source text into chunks of 5000 characters, which is the maximum number of characters you can paste into the DeepL web interface:

https://forum.keyboardmaestro.com/t/would-like-to-create-a-macro-chop-up-text-in-approx-400-word-segments/9946/7
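
The linked macros are for Keyboard Maestro. As a rough Python equivalent (not the macros themselves), the sketch below groups whole segments into chunks that stay within a character limit. The function name and the choice to join segments with newlines are my own assumptions; the 5000-character default is the limit mentioned above.

```python
from typing import List


def chunk_segments(segments: List[str], limit: int = 5000) -> List[str]:
    """Group segments into newline-joined chunks of at most `limit` characters."""
    chunks: List[str] = []
    current: List[str] = []
    size = 0
    for segment in segments:
        if len(segment) > limit:
            raise ValueError("a single segment is longer than the limit")
        # Count one extra character for the newline that joins this
        # segment to the previous one in the same chunk.
        extra = len(segment) + (1 if current else 0)
        if size + extra > limit:
            chunks.append("\n".join(current))
            current, size = [segment], len(segment)
        else:
            current.append(segment)
            size += extra
    if current:
        chunks.append("\n".join(current))
    return chunks
```

Splitting only at segment boundaries keeps each sentence intact, at the cost of chunks that are usually somewhat shorter than the limit.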

One idea that comes to mind is to save these chunks of 5000 characters and the DeepL-generated translations as bitexts and import these into CafeTran Espresso 2018.

I'd have to investigate whether this approach would improve terminological consistency.



Hans Lenting  Identity Verified
Netherlands
Member (2006)
German to Dutch
+ ...
TOPIC STARTER
Example Apr 5

A Terminotix bitext looks like this:

<?xml version="1.0" encoding="utf-8"?>
<bitext version="1.2">
<meta>
<langsrc>ENG</langsrc>
<langtgt>DEU</langtgt>
</meta>
<segments>
<seg id="1">
<src>Source sentence one. Source sentence two. Source sentence three.</src>
<tgt>Target sentence one. Target sentence two. Target sentence three.</tgt>
</seg>
</segments>
</bitext>
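
A file with this structure could be generated from (source chunk, translated chunk) pairs roughly as follows. This is only a sketch: it mirrors the elements visible in the sample above and nothing else, the function name is made up, and whether CafeTran accepts the output is untested.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Tuple


def build_bitext(pairs: Iterable[Tuple[str, str]],
                 src_lang: str, tgt_lang: str) -> str:
    """Build a bitext document shaped like the sample above.

    Element names and the version attribute are copied from that sample;
    the language codes are passed through unchanged.
    """
    root = ET.Element("bitext", version="1.2")
    meta = ET.SubElement(root, "meta")
    ET.SubElement(meta, "langsrc").text = src_lang
    ET.SubElement(meta, "langtgt").text = tgt_lang
    segments = ET.SubElement(root, "segments")
    for number, (src, tgt) in enumerate(pairs, start=1):
        seg = ET.SubElement(segments, "seg", id=str(number))
        ET.SubElement(seg, "src").text = src
        ET.SubElement(seg, "tgt").text = tgt
    # Returns the document as a string, without the XML declaration line.
    return ET.tostring(root, encoding="unicode")
```

ElementTree escapes characters such as `&` and `<` in the segment text, which is easy to get wrong when pasting the XML together by hand.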

Import of a sample bitext:

[Screenshots: 1, 2]

Of course, it would be nice to have the segmentation adjusted automatically.
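
One way such an adjustment might work, sketched in Python under strong assumptions: sentences are split naively after `.`, `!` or `?`, and a multi-sentence segment is only broken up when source and target yield the same number of sentences. The function names are made up, and this is not how CafeTran does it. Abbreviations such as "z. B." would trip up the splitter.

```python
import re
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Naive rule: a sentence ends at ., ! or ? followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text: str) -> List[str]:
    """Split text into sentences with the naive rule above."""
    return [s for s in _SENTENCE_END.split(text.strip()) if s]


def resegment(bitext_xml: str) -> List[Tuple[str, str]]:
    """Turn multi-sentence <seg> pairs into sentence-level pairs."""
    root = ET.fromstring(bitext_xml)
    pairs: List[Tuple[str, str]] = []
    for seg in root.iter("seg"):
        src = seg.findtext("src", default="")
        tgt = seg.findtext("tgt", default="")
        src_parts = split_sentences(src)
        tgt_parts = split_sentences(tgt)
        if len(src_parts) == len(tgt_parts):
            # Same sentence count: pair the sentences up one to one.
            pairs.extend(zip(src_parts, tgt_parts))
        else:
            # Counts differ: keep the segment whole rather than guess.
            pairs.append((src, tgt))
    return pairs
```

Applied to the sample bitext above, this would produce three sentence pairs instead of one three-sentence segment. Equal counts do not guarantee that the sentences actually correspond, so the result would still need a manual check.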

[Edited at 2018-04-05 19:05 GMT]

