Using Reverse Machine translation to improve translation quality
Thread poster: Jing Nie

Jing Nie
China
Local time: 02:00
Member (2011)
English to Chinese
+ ...
Dec 18, 2015

I often use Google Translator Toolkit or Microsoft's translation tool to give me suggestions when translating. The machine translation results are really helpful.

About a year ago, I got a large, difficult medical translation project. I found that machine translation alone was not enough; I had to search online many, many times for some terms. Even so, I was still not sure about some of them.

Then I tried to solve the problem.
I collected many related articles in the target language. I put these articles together into one file, about 300,000 words in total, and used a CAT tool to segment it into sentences.
Then I uploaded the segmented sentences and used Google Translator Toolkit to machine-translate them into the source language.
I made a TMX file from the results of this reverse machine translation and imported it into the translation memory of my CAT tool.
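
As a rough illustration of the last step, here is a minimal Python sketch that writes sentence pairs into a TMX 1.4 document. It is not the export routine of any particular CAT tool. The function name `build_tmx`, the `creationtool` value, the language codes and the sample sentence pair are all assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Clark-notation name of the xml:lang attribute; ElementTree writes it as "xml:lang".
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(pairs, src_lang, tgt_lang):
    """Build a TMX document from (machine_translated_source, native_target) pairs.

    The source side is the reverse machine translation; the target side is
    the text originally written by native speakers.
    """
    tmx = ET.Element("tmx", version="1.4")
    # Header attributes required by TMX 1.4; the values are placeholders.
    ET.SubElement(tmx, "header", {
        "creationtool": "reverse-mt-sketch",  # hypothetical tool name
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "plain",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for src_text, tgt_text in pairs:
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, src_text), (tgt_lang, tgt_text)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


# Invented sample pair: Chinese produced by MT, English written by a native speaker.
sample = [("肺栓塞是一种常见疾病。", "Pulmonary embolism is a common condition.")]
tmx_text = build_tmx(sample, "zh-CN", "en-US")
```

The returned string can be saved with a `.tmx` extension (adding an XML declaration if the importing tool insists on one). Whether a given CAT tool accepts it without further header details would need to be checked against that tool.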

I found that this new TM is very useful for looking up the translation of certain source terms. Since the target text in the TM is written by native speakers, I can be fairly sure that most of the search results are accurate.
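
The kind of lookup described here can be sketched in a few lines of Python. This is only a plain substring search, far simpler than the concordance search of a real CAT tool; the function name `concordance` and the sample TM entries are invented for the illustration.

```python
def concordance(tm_pairs, term):
    """Return the native-written target sentences whose machine-translated
    source side contains the search term (case-insensitive substring match)."""
    needle = term.casefold()
    return [tgt for src, tgt in tm_pairs if needle in src.casefold()]


# Invented TM entries: (reverse-MT source, native target).
tm = [
    ("肺栓塞是一种常见疾病。", "Pulmonary embolism is a common condition."),
    ("患者接受了抗凝治疗。", "The patient received anticoagulation therapy."),
]
hits = concordance(tm, "肺栓塞")
```

Each hit shows how native writers phrase the term in context, which is the part that still has to be checked by the translator.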

I have used this reverse machine translation method to build large TMs for big projects for a long time now. So far, I think it is a very good way to improve translation quality, and it has saved me a lot of time.






[Edited at 2015-12-18 15:23 GMT]


 

Soonthon LUPKITARO(Ph.D.)  Identity Verified
Thailand
Local time: 01:00
Member (2004)
English to Thai
+ ...
Certainly Dec 18, 2015

Jing Nie wrote:

I found that this new TM is very useful for looking up the translation of certain source terms. Since the target text in the TM is written by native speakers, I can be fairly sure that most of the search results are accurate.

I have used this reverse machine translation method to build large TMs for big projects for a long time now. So far, I think it is a very good way to improve translation quality, and it has saved me a lot of time.



Your method is certainly effective for mastering new translation contexts. I usually do something similar, but mostly to reduce my workload, e.g. the time needed to key in large amounts of target text.
MT also suggests words I had never thought of, e.g. synonyms and spellings with different meanings.
I believe, as many surveys have reported, that MT is going to dominate advanced technical translation in the near future.

Soonthon L.


 

LegalTransform  Identity Verified
United States
Local time: 14:00
Member (2002)
Spanish to English
+ ...
Isn't this just Linguee? Dec 18, 2015

http://www.linguee.com/english-chinese/translation/pulmonary%20embolism.html

 

Henry Dotterer
Local time: 14:00
SITE FOUNDER
Thanks, Jing Dec 18, 2015

Jing Nie wrote:

> I collected many related articles in the target language.
> I put these articles together into one file, about 300,000 words in total, and used a CAT tool to segment it into sentences.
> Then I uploaded the segmented sentences and used Google Translator Toolkit to machine-translate them into the source language.
> I made a TMX file from the results of this reverse machine translation and imported it into the translation memory of my CAT tool.
>
> I found that this new TM is very useful for looking up the translation of certain source terms. Since the target text in the TM is written by native speakers, I can be fairly sure that most of the search results are accurate.

Interesting technique, thanks for posting. Which tool do you use? Which feature of that tool? I assume you do not get that many TU-level matches - you just use it for search, right?

If you do the same search at Linguee, how do the results compare?


 

Michael Wetzel  Identity Verified
Germany
Local time: 20:00
German to English
vs. Linguee Dec 18, 2015

My understanding has always been that Linguee does precisely the opposite = it looks for bilingual sites containing the search terms or phrases and then displays both languages of the site side-by-side. That is to say, Linguee does the opposite of what we should actually be doing = randomly sifts through the heaps of garbage translations on the Internet and places them in neat columns for us. I would guess it is probably better going from English into a foreign language, because then it would be doing something more similar to what Jing is describing, but in general I've never found much use for the site.

I'm intrigued by Jing's idea = he is working with "parallel texts" in the proper, narrow sense of the term (= texts originally written in English of a similar type or subject to what we are translating and NOT a synonym for previously translated texts). That means that Jing's system may pair terms with the wrong foreign terms, but they are real terms in English and if we actually come across one of the foreign terms in a genuine foreign source text, then that makes an incorrect pairing much less likely.


 

Jing Nie
China
Local time: 02:00
Member (2011)
English to Chinese
+ ...
TOPIC STARTER
Using MemoQ or WORDFAST PRO or TRADOS Dec 19, 2015

Henry Dotterer wrote:

Interesting technique, thanks for posting. Which tool do you use? Which feature of that tool? I assume you do not get that many TU-level matches - you just use it for search, right?

If you do the same search at Linguee, how do the results compare?


As for the CAT tool used to prepare the segments, I prefer memoQ. It can combine many documents into one file using its "view" function and then export a two-column RTF file. You can just upload the left column to Google Translator Toolkit and get the machine translation results. Then you can make your own two-column file and convert it into a TMX file. In memoQ this is very simple, since it can import a CSV file directly.
I think Wordfast Pro or Trados can also handle this task.
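
For the "make your own two-column file" step, a small Python sketch could look like the one below. It assumes the machine translation output and the original segments are available as two line-aligned lists (one segment per entry, same order); the function name `pairs_to_csv` and the sample segments are made up, and the column layout a particular CAT tool expects on import is not something this sketch can guarantee.

```python
import csv
import io


def pairs_to_csv(mt_source_lines, native_target_lines):
    """Join two line-aligned lists into two-column CSV text.

    Column 1: machine-translated source segment.
    Column 2: original target segment written by a native speaker.
    """
    if len(mt_source_lines) != len(native_target_lines):
        # A length mismatch means the segments are no longer aligned.
        raise ValueError("segment lists must have the same length")
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for src, tgt in zip(mt_source_lines, native_target_lines):
        writer.writerow([src.strip(), tgt.strip()])
    return buffer.getvalue()


# Invented sample segments.
csv_text = pairs_to_csv(
    ["肺栓塞是一种常见疾病。", "患者接受了抗凝治疗。"],
    ["Pulmonary embolism is a common, serious condition.",
     "The patient received anticoagulation therapy."],
)
```

Using the `csv` module rather than joining with commas by hand matters here, because medical sentences often contain commas and quotation marks that would otherwise break the columns. The length check is a cheap guard against the MT tool having merged or split segments.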

As for Linguee, it is useless for me, since 95% of my searches return no results.


 

Dan Lucas  Identity Verified
United Kingdom
Local time: 19:00
Member (2014)
Japanese to English
Some kind of document repository Dec 19, 2015

Jing Nie wrote:
As for the CAT tool used to prepare the segments, I prefer memoQ. It can combine many documents into one file using its "view" function

Stepping back to the document collection phase, how did you designate 30k documents? I can understand that once you have the URLs it should be straightforward to scrape the content, but where does one get so many documents?

Regards
Dan


 

Jing Nie
China
Local time: 02:00
Member (2011)
English to Chinese
+ ...
TOPIC STARTER
Simpler than you expected. Dec 19, 2015

Dan Lucas wrote:


Stepping back to the document collection phase, how did you designate 30k documents? I can understand that once you have the URLs it should be straightforward to scrape the content, but where does one get so many documents?

Regards
Dan


A good question. For example, if I want to translate a medical document about a disease into English, I search for the English keywords mentioned in the article, such as the name of the disease, the treatments and the test methods. I prefer to add filetype:PDF to the Google search, so that I get a lot of PDF files. I did not mean 30K documents; I meant that the total word count is about 300,000 words.
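
The search strings themselves are easy to generate. The following Python sketch only builds the query text for each keyword; it does not call any search service, and the function name `pdf_queries` and the sample keywords are invented for the example.

```python
def pdf_queries(keywords):
    """Build one search query per keyword, restricted to PDF results.

    Each keyword is quoted so that multi-word terms are searched as a phrase.
    """
    return [f'"{keyword}" filetype:pdf' for keyword in keywords]


# Invented sample keywords: disease name, treatment, test method.
queries = pdf_queries(["pulmonary embolism", "anticoagulation therapy", "D-dimer test"])
# queries[0] is '"pulmonary embolism" filetype:pdf'
```

The resulting strings can be pasted into the search box one by one; collecting and converting the PDFs is a separate, manual step.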

[Edited at 2015-12-19 09:41 GMT]


 

The Misha
Local time: 14:00
Russian to English
+ ...
This sounds like a fairly ingenious technique indeed Dec 19, 2015

However, I would venture a guess that in real life most of us prefer solving similar problems by improving our language skills, learning more of the subject area or simply refusing to handle jobs in areas that we do not know enough about. Not to be a critic, but what was that definition of a computer again, as a complicated electronic device that uses this and that, and hours of time to accomplish something that could be easily enough done with paper and pencil?

 

Miguel Carmona  Identity Verified
United States
Local time: 11:00
English to Spanish
... Dec 19, 2015

Jing Nie wrote:

I uploaded the segmented sentences and used Google Translator Toolkit to machine-translate them into the source language.

I made a TMX file from the results of this reverse machine translation and imported it into the translation memory of my CAT tool.


Do you verify the correctness of every segment machine translated into the source language?

Given the unguaranteed quality of machine translation output, this first QA step would seem necessary to make sure that a TM created that way is accurate and not (God forbid) dangerously wrong.

Such a verification process would be a daunting task in itself, given not only the sheer number of segments but also the complexity of the text involved.


 

Jing Nie
China
Local time: 02:00
Member (2011)
English to Chinese
+ ...
TOPIC STARTER
The value of reverse machine translation. Dec 20, 2015

Miguel Carmona wrote:

Do you verify the correctness of every segment machine translated into the source language?

Given the unguaranteed quality of machine translation output, this first QA step would seem necessary to make sure that a TM created that way is accurate and not (God forbid) dangerously wrong.

Such a verification process would be a daunting task in itself, given not only the sheer number of segments but also the complexity of the text involved.


Thanks for your suggestion. I verify the correctness of each term online. So I can assure the correctness.

The target-language side of this machine-created TM is written by native speakers with relevant work experience. Their choice of words is much more accurate than that of a translation. Sometimes one term has several possible translations in the target language and I am not sure which one is best; in the past, I had to search online many times. With the help of the reverse machine translation TM, once I confirm that the reverse machine translation is correct, I need not consider other alternative translations.

Sometimes I just cannot find the translation of a term online, but this TM can give me clues, so that I can find the correct translation much more easily.

[Edited at 2015-12-20 01:08 GMT]


 

Steven Segaert
Estonia
Local time: 21:00
Member (2012)
English to Dutch
+ ...
Subject know-how? Dec 20, 2015

"I verify the correctness of each term online. So I can assure the correctness."

Well ... that's only true if you know quite a bit about the subject matter. Otherwise, you're only increasing the odds of getting a term right through clever(er) machine translation.

I often take on legal translations. Very often, I get to correct "translations" where the dictionary translation of the word is correct, but where the use of the term doesn't make sense given the context of the document. After a few dozen such frustrating experiences, I no longer offer proofreading unless I know who the translator is.

And I would imagine medical is much more sensitive than legal.


 

Roni_S  Identity Verified
Slovakia
Local time: 20:00
Slovak to English
Great food for thought Dec 20, 2015

I like your idea, and using texts written by native speakers is the way to go. However, and I think this is key, native speaker-written texts must always be used because as Steven says, sometimes the bilingual dictionary meaning of a word is right but its use is wrong. I once found on a menu that my food would be garnished with 'firmament', which is one of the available translations of the word that comes from my source language.
Now that's twice I've used a food reference. I must be hungry!
Roni


 

Jing Nie
China
Local time: 02:00
Member (2011)
English to Chinese
+ ...
TOPIC STARTER
Do the advantages of this method outweigh the disadvantages? Dec 20, 2015

Steven Segaert wrote:

"I verify the correctness of each term online. So I can assure the correctness."

Well ... that's only true if you know quite a bit about the subject matter. Otherwise, you're only increasing the odds of getting a term right through clever(er) machine translation.

I often take on legal translations. Very often, I get to correct "translations" where the dictionary translation of the word is correct, but where the use of the term doesn't make sense given the context of the document. After a few dozen such frustrating experiences, I no longer offer proofreading unless I know who the translator is.

And I would imagine medical is much more sensitive than legal.


I understand what you mean. Of course, I cannot guarantee that the search results in the created TM are 100% correct.

My question is: do the advantages of this method outweigh the disadvantages? In my experience, the answer is yes.


 

Miguel Carmona  Identity Verified
United States
Local time: 11:00
English to Spanish
... Dec 20, 2015

Jing Nie wrote:

Miguel Carmona wrote:

Do you verify the correctness of every segment machine translated into the source language?


I verify the correctness of each term online. So I can assure the correctness.

The target-language side of this machine-created TM is written by native speakers with relevant work experience.


But at one point, in your reverse translation system, the problem is verifying the machine created source text as a translation of the target text.

The source text has to be a correct translation of the target text for the TM to be correct.

Sorry for repeating myself but, do you check the correctness of the source text as a translation of the target text?

[Edited at 2015-12-20 19:14 GMT]


 