Personalising machine translation by feeding in translation memories or bilingual files?
Thread poster: Mark Bossanyi

Mark Bossanyi  Identity Verified
Bulgaria
Local time: 02:53
Member (2008)
French to English
+ ...
Jun 11

Does anyone know of a (neural?) machine translation system that I can personalise or train to translate in my own style by feeding in the translation memories or CAT tool bilingual (xliff) files that I have compiled over the years?

[Edited at 2019-06-11 07:02 GMT]

[Edited at 2019-06-11 07:14 GMT]
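Whatever system ends up being used, the bilingual files first have to be reduced to plain source/target pairs. The following is only a minimal sketch of that step, assuming XLIFF 1.2 files with the standard `trans-unit`, `source` and `target` elements; the function name `extract_pairs` and the sample document are made up for illustration, and the import format a given MT engine actually expects is not stated in this thread.

```python
import xml.etree.ElementTree as ET

# Namespace of XLIFF 1.2; other XLIFF versions use a different namespace and layout.
XLIFF_NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}


def extract_pairs(xliff_text):
    """Return (source, target) pairs from an XLIFF 1.2 string, skipping empty targets."""
    root = ET.fromstring(xliff_text)
    pairs = []
    for unit in root.iterfind(".//x:trans-unit", XLIFF_NS):
        source = unit.find("x:source", XLIFF_NS)
        target = unit.find("x:target", XLIFF_NS)
        if source is None or target is None:
            continue
        # itertext() drops inline tags and keeps only the text they contain.
        src = "".join(source.itertext()).strip()
        tgt = "".join(target.itertext()).strip()
        if src and tgt:
            pairs.append((src, tgt))
    return pairs


# Hypothetical two-segment file; the second segment has no translation yet.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file source-language="fr" target-language="en" datatype="plaintext" original="demo.txt">
    <body>
      <trans-unit id="1">
        <source>Bonjour le monde</source>
        <target>Hello world</target>
      </trans-unit>
      <trans-unit id="2">
        <source>Sans traduction</source>
        <target></target>
      </trans-unit>
    </body>
  </file>
</xliff>"""

if __name__ == "__main__":
    for src, tgt in extract_pairs(SAMPLE):
        print(f"{src}\t{tgt}")
```

Untranslated segments are skipped on purpose, since they would add nothing to a personalised engine.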


 

DZiW
Ukraine
English to Russian
+ ...
redundancy Jun 12

Mark, a modern neural network essentially brute-forces combinations of factors, weights and possible outcomes, with the weights usually set at random at first and then adjusted towards an approved solution. It therefore has to account for every characteristic and guess their unequal 'importance' weights, and then rely on individual (constantly human-assisted) feedback, which is very limited and inefficient. In short, it cannot know for sure what is RIGHT for you in every case or context. Despite the hype, what "AI" can do nowadays is rather mediocre.

Indeed, you can use your own lexicon and complement CAT/MT engines such as PROMT with your TMs, but no program can properly mimic your preferences or peculiarities; it can only guess by trial and error. It is more reasonable to prefer specific rule-based MT systems that propagate your TM fragments.

Every client or audience requires a certain style. Furthermore, most expressions have no single correct equivalent, whereas the machine wants a clear-cut answer, or simply uses the first 'best match', which may not always be the best for readers.

Therefore, you could take the traditional route and couple a CAT tool with MT, so that it finds the best match or reuses your own edited variants.
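The coupling described above can be sketched roughly as follows: look for a close enough match in the translation memory first, and only fall back to machine translation when none is found. This is an illustration under stated assumptions, not how any particular CAT tool works: the TM is a plain dictionary, similarity is computed with Python's `difflib` rather than a real fuzzy-match algorithm, the 0.75 threshold is arbitrary, and `fake_mt` is a placeholder for an actual MT engine.

```python
from difflib import SequenceMatcher


def best_tm_match(segment, tm):
    """Return (score, source, target) for the TM entry most similar to segment."""
    best = (0.0, None, None)
    for source, target in tm.items():
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score > best[0]:
            best = (score, source, target)
    return best


def translate(segment, tm, machine_translate, threshold=0.75):
    """Prefer a sufficiently close TM match; otherwise fall back to MT."""
    score, _, target = best_tm_match(segment, tm)
    if target is not None and score >= threshold:
        return target, "TM"
    return machine_translate(segment), "MT"


def fake_mt(segment):
    # Placeholder standing in for a call to a real MT engine.
    return f"[MT] {segment}"


# Hypothetical one-entry translation memory (French to English).
TM = {"Le contrat est résilié.": "The contract is terminated."}

if __name__ == "__main__":
    for text in ("Le contrat est résilié.", "Bonjour"):
        print(translate(text, TM, fake_mt))
```

Returning the origin ("TM" or "MT") alongside the text makes it easy to see which segments still need closer post-editing.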


 

Mark Bossanyi  Identity Verified
Bulgaria
Local time: 02:53
Member (2008)
French to English
+ ...
TOPIC STARTER
Thank you DZiW, Jun 12

Yes, I would not expect any system, neural or otherwise, to always get it exactly how I want it. I have used neural systems without any input from my TMs and quite often found the results very good. But being a bit of a perfectionist in terms of style and readability, I find myself spending too much time on post-editing, changing the order of clauses, etc., which can sometimes even make it take almost as long as translating the text from scratch.

But I gather from your reply that neural systems would be less responsive to my translation memories than rule-based machine translation systems. Do you think SDL Language Cloud could be a suitable option, for example? And importantly, I am wondering whether systems such as SDL Language Cloud and PROMT would be able to sequester my input for confidentiality reasons. Do you have any information on this?


 

Jean Dimitriadis  Identity Verified
France
Local time: 01:53
Member
English to French
+ ...
ModernMT Jun 12

There's ModernMT ( https://www.modernmt.eu/ ), but the cloud edition is expensive (€4 per thousand words).

I see it does EN>FR and FR>EN, but only EN>BG, not BG>EN.

It can be used directly in Matecat, and maybe in other CAT tools as well.

There's also a free and open source edition in case you want to get your hands dirty: https://github.com/modernmt/modernmt

[Edited at 2019-06-12 10:42 GMT]


 

Mark Bossanyi  Identity Verified
Bulgaria
Local time: 02:53
Member (2008)
French to English
+ ...
TOPIC STARTER
Thank you Jean Dimitriadis, Jun 12

I will look into these options.

For ModernMT, is the €4 that you mention per 1000 words of translation output or per 1000 words of TM input?


 

Jean Dimitriadis  Identity Verified
France
Local time: 01:53
Member
English to French
+ ...
Cost Jun 12

There is no cost for uploading your memories, only for using their SaaS. It's €4 for 1,000 words of MT output.
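As a rough illustration of what that rate means in practice, here is a small sketch that estimates the bill for a given volume of MT output. It assumes strictly linear pricing at the €4 per 1,000 words quoted above, with no minimum charge or volume discount (the thread does not say either way); the function name `mt_cost` and the 25,000-word volume are hypothetical.

```python
from decimal import Decimal

# EUR per 1,000 words of MT output, the figure quoted in this thread.
RATE_PER_1000_WORDS = Decimal("4.00")


def mt_cost(output_words, rate=RATE_PER_1000_WORDS):
    """Estimated cost in EUR for a given number of MT output words."""
    cost = Decimal(output_words) * rate / Decimal(1000)
    return cost.quantize(Decimal("0.01"))


if __name__ == "__main__":
    # Hypothetical monthly volume of 25,000 words.
    print(mt_cost(25_000))
```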

From their website: State-of-the-art neural machine translation as a service that learns from your translation memories and corrections.

Edited to add the following:

If you try Matecat, there is a big caveat.

Please note that by default, the segments are saved in a public TM (MyMemory).

I'm pretty sure you want to avoid that, so here's what to do:

- Create a private TM resource: on the project creation page, click Settings (alternatively, in the TM and glossary field, expand the drop-down menu and select Create resource).
- Click the + New resource button in the dialog that opens. Give the TM an optional name and hit Confirm. You will see that the “MyMemory: Collaborative translation memory” resource is still Enabled for Lookup, but no longer set to be Updated. That way, translated segments will only be stored in your private resources. You need to do this systematically for each new project.

After that, you are good to go.

[Edited at 2019-06-12 10:53 GMT]


 

Mark Bossanyi  Identity Verified
Bulgaria
Local time: 02:53
Member (2008)
French to English
+ ...
TOPIC STARTER
Thanks again Jean, Jun 12

for your very helpful reply.

 

