Using Google Translate to process long texts without buying an API access key
Thread poster: Emal Ghamsharick

Emal Ghamsharick
Germany
English to German
Nov 11, 2014

Do you translate long texts with many random strings, e.g., for software interfaces and websites? Do you want to pre-process parts of them, so that you can just proofread the output (“machine translation post-editing”) instead of typing it all?

Some computer-assisted translation (CAT) tools let you use Google’s Translate API. The service is not free. But you can hack it. First I’ll explain the theory, then walk through a specific example.

More: http://noordertranslation.wordpress.com/2014/11/11/3/


Alex Lago
Spain
Member (2009)
English to Spanish
You aren't using the right terms, your suggestion is not a hack, it is a workaround Nov 11, 2014

First of all, you should know that hacking a program is illegal, and advertising the fact that you hack programs could get you into trouble with the copyright owner and the authorities, so if I were you I would be careful about advertising hacks.

Second, what you have posted on your website is not in fact a hack but a workaround. You are not hacking into the Google API; you are showing people how to achieve the same result without using the API itself. That is called a workaround and is in fact legal, so no problem there.


Samuel Murray
Netherlands
Member (2006)
English to Afrikaans
Google Translate by alignment Nov 11, 2014

Emal Ghamsharick wrote:
Some computer-assisted translation (CAT) tools let you use Google’s Translate API. The service is not free. But you can hack it.


It looks as if you're explaining how to align a translation that was done on Google Translate. Yes, if you don't have a Google Translate API subscription (or if you find the API too cumbersome even if you have a subscription, as I have), you can create a TM by translating the source text with Google Translate and then aligning the translation to the source text.

In your case, you used Wordfast Anywhere to perform the segment extraction (not all CAT tools offer segment extraction, but I know that Wordfast Classic has it). You then used Excel to create the TM, but you can also use PlusTools to align a two-column table.

Finally, you seem to use the word "auto-propagate" to mean "propagate". If you have to copy/paste content into the columns manually, then it's not "auto" ... (-: ... but we know what you meant.

I find it odd that you use Wordfast Anywhere, if you assume that the user has Wordfast Pro, because Wordfast Pro's "PM Perspective" view contains both a segment extraction and a bilingual table creator, just like Wordfast Anywhere. I suppose the advantage of Wordfast Anywhere is that it works also for people who don't have Wordfast Pro.

Here's how I usually do it:

I open the file in Wordfast and then do a segment extraction. I save the extracted segment as aaa.txt. I then resave aaa.txt as bbb.txt and perform some preprocessing on it (e.g. remove sensitive information). Then, I translate bbb.txt in Google Translate (using the program QTranslate) and save the translation as ccc.txt. I then save ccc.txt as ddd.txt and perform some post-processing on it (e.g. fix the spacing errors that Google introduces). Then I align aaa.txt with ddd.txt using PlusTools, and when I'm happy with the result (usually it's perfectly aligned), I generate the TM. Finally, I change the translation units' user ID to something that tells me it's a Google translation. I create bbb.txt and ccc.txt to enable me to roll back to a previous state if I discover that I should have done something a little different.
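For readers who would rather script the alignment step, here is a minimal Python sketch of the same idea. It is not PlusTools and not necessarily how the workflow above is carried out: it assumes that aaa.txt and ddd.txt hold one segment per line in the same order, and the names `build_tmx` and `read_segments`, the tool name in the header and the `GOOGLE_MT` user ID are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree writes this qualified name out as the attribute xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(source_segments, target_segments, src_lang, tgt_lang, creation_id):
    """Pair segments line by line and return a TMX document as a string."""
    # Hypothetical safeguard: refuse to align lists of different lengths.
    if len(source_segments) != len(target_segments):
        raise ValueError("source and target have different segment counts")

    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "align-sketch",  # made-up tool name
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "plain text",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for src, tgt in zip(source_segments, target_segments):
        # The creationid attribute flags each unit as machine translated.
        tu = ET.SubElement(body, "tu", creationid=creation_id)
        for lang, text in ((src_lang, src), (tgt_lang, tgt)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


def read_segments(path):
    """Read one segment per line, dropping only the line terminators."""
    with open(path, encoding="utf-8") as handle:
        return handle.read().splitlines()
```

A call such as `build_tmx(read_segments("aaa.txt"), read_segments("ddd.txt"), "en", "af", "GOOGLE_MT")` would then produce the TM in one go. Unlike a real alignment tool, the sketch cannot repair a misalignment; it only refuses to continue when the line counts differ.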


Michael Joseph Wdowiak Beijer
United Kingdom
Member (2009)
Dutch to English
Come on, guys… Nov 11, 2014

Way too much trouble if you ask me. A subscription isn't exactly expensive.

Michael


Salam Alrawi
United States
English to Arabic
Removing sensitive info Nov 11, 2014

Samuel Murray wrote:

Emal Ghamsharick wrote:
Some computer-assisted translation (CAT) tools let you use Google’s Translate API. The service is not free. But you can hack it.


It looks as if you're explaining how to align a translation that was done on Google Translate. Yes, if you don't have a Google Translate API subscription (or if you find the API too cumbersome even if you have a subscription, as I have), you can create a TM by translating the source text with Google Translate and then aligning the translation to the source text.

In your case, you used Wordfast Anywhere to perform the segment extraction (not all CAT tools offer segment extraction, but I know that Wordfast Classic has it). You then used Excel to create the TM, but you can also use PlusTools to align a two-column table.

Finally, you seem to use the word "auto-propagate" to mean "propagate". If you have to copy/paste content into the columns manually, then it's not "auto" ... (-: ... but we know what you meant.

I find it odd that you use Wordfast Anywhere, if you assume that the user has Wordfast Pro, because Wordfast Pro's "PM Perspective" view contains both a segment extraction and a bilingual table creator, just like Wordfast Anywhere. I suppose the advantage of Wordfast Anywhere is that it works also for people who don't have Wordfast Pro.

Here's how I usually do it:

I open the file in Wordfast and then do a segment extraction. I save the extracted segment as aaa.txt. I then resave aaa.txt as bbb.txt and perform some preprocessing on it (e.g. remove sensitive information). Then, I translate bbb.txt in Google Translate (using the program QTranslate) and save the translation as ccc.txt. I then save ccc.txt as ddd.txt and perform some post-processing on it (e.g. fix the spacing errors that Google introduces). Then I align aaa.txt with ddd.txt using PlusTools, and when I'm happy with the result (usually it's perfectly aligned), I generate the TM. Finally, I change the translation units' user ID to something that tells me it's a Google translation. I create bbb.txt and ccc.txt to enable me to roll back to a previous state if I discover that I should have done something a little different.



I really enjoyed your explanation, but I have one question for you, if you don't mind answering. Why would you remove something (e.g. sensitive information) from file bbb.txt? Will Google pick it up or something?


Wolfgang Jörissen
Belize
Member
Dutch to German
0,64 € Nov 11, 2014

That was my last bill for API usage. I am not a heavy user, but nevertheless, a couple of colleagues of mine use the same key. Even if I had criminal energy within me (quod non), I would see no reason for "hacking" that system.

Jorge Payan
Colombia
Member (2002)
German to Spanish
EUR 3,8 per month Nov 11, 2014

That is how much I pay Dallas Cao for using his Google Translate for Translators (GT4T) tool (http://dallascao.com/en/gt4t). It sends the text to be translated to both Bing Translator and Google Translate (and even to mymemory.translated.net) and inserts the translation back into any target segment field in the TenT environment, Word, Excel, the clipboard, etc.

It saves me the time I would otherwise spend messing around with settings in each TenT I use and, being a Windows application, it works almost anywhere on the screen. Dallas takes care of paying for the API usage to whoever is owed.

Frankly, I don't see the point of saving yourself such a meager amount per month. The investment I make is easily amortized after the first 15 minutes of paid work.

Saludos


Samuel Murray
Netherlands
Member (2006)
English to Afrikaans
The API is slow, that's what Nov 11, 2014

Wolfgang Jörissen wrote:
I am not a heavy user, but nevertheless, a couple of colleagues of mine use the same key. Even if I had criminal energy within me (quod non), I would see no reason for "hacking" that system.


The API translates one segment at a time, which is a slow process, particularly with very short segments. Sometimes one wants more speed, and then using the API is not suitable.


Michael Joseph Wdowiak Beijer
United Kingdom
Member (2009)
Dutch to English
absolute insanity Nov 11, 2014

Samuel Murray wrote:

Here's how I usually do it:

I open the file in Wordfast and then do a segment extraction. I save the extracted segment as aaa.txt. I then resave aaa.txt as bbb.txt and perform some preprocessing on it (e.g. remove sensitive information). Then, I translate bbb.txt in Google Translate (using the program QTranslate) and save the translation as ccc.txt. I then save ccc.txt as ddd.txt and perform some post-processing on it (e.g. fix the spacing errors that Google introduces). Then I align aaa.txt with ddd.txt using PlusTools, and when I'm happy with the result (usually it's perfectly aligned), I generate the TM. Finally, I change the translation units' user ID to something that tells me it's a Google translation. I create bbb.txt and ccc.txt to enable me to roll back to a previous state if I discover that I should have done something a little different.



Wow, I just re-read your post. Why in god's name would you want to do all that before every job? What a complete waste of time. I just don't understand why you want to go through all those steps every time you start a job, just to end up with a crappy Google Translated TMX?! What about multi-file projects?

You say that the API is too slow, but how fast do you need it to be? I mean, don't you translate segment by segment, like everyone else? Or is it just to save money? You say "Sometimes one wants more speed", but why exactly? What is the benefit, other than getting something for free?

Also, I've been thinking about the various GT "facilitators" out there, and I have a sneaking suspicion they probably aren't paying Google anything. More likely they just coded a hack and invented a pricing schedule.

"3. Both Google Translate API and Bing translator API are now paid services, but you don’t need to pay Google or Microsoft directly. As a GT4T customer, you pay me and I pay Google and Microsoft." (http://gt4t.net/en/purchase/ )

Hmm.

How can GT4T offer set annual rates? It just doesn't make sense. If it was legit, wouldn't you expect the software to just have an input field for the Google Translate API key like all the CAT tools?

Michael

[Edited at 2014-11-11 23:10 GMT]


Samuel Murray
Netherlands
Member (2006)
English to Afrikaans
@Michael Nov 12, 2014

Michael Beijer wrote:
I just re-read your post. Why ... would you want to do all that before every job?


I don't do that before every job (I never said so, either). I was simply describing my method of accomplishing what the original poster accomplishes with his method. As to "why"... well, as stated in more than one post in this thread, allow me to repeat: it is sometimes better to get Google translations in bulk instead of piecemeal.

What a complete waste of time. I just don't understand why you want to go through all those steps ... just to end up with a crappy Google Translated TMX?!


(Well, yes, if one ends up with a TMX file, it would certainly be a waste of time... because then you'd have to convert the TMX back to something useful again.)

CAT hopping always takes extra time. Whether it is wasted time depends on how much time is saved in the end.

What about multi-file projects?


Perhaps your CAT tool can't handle multifile segment extraction, but that does not mean no CAT tools can. Wordfast Pro can do it. Wordfast Classic used to be able to do it. OmegaT can do it. With a bit of tinkering, even Trados 2009+ can do it.

And besides, if you prefer to use a CAT tool that can extract only one file at a time, well... ever heard of "merge"?

You say that the API is too slow, but how fast do you need it to be?


The original post was specifically about translating lists of words. The API causes pauses of up to a second between each segment. When translating lists, those pauses can be excruciating.

I mean, don't you translate segment by segment, like everyone else?


Translating segment by segment does not mean one can't do preparations for more than one segment at a time. After all, when you receive a file with lots of formatting problems, don't you start off by fixing the entire file, before you start translating even the first segment?

There is nothing strange about creating reference materials before you start translating.

You say "Sometimes one wants more speed", but why exactly?


What an odd question.

Remember, the "more speed" that we're talking about in this thread relates less to overall translation speed than to responsiveness of the user interface. The API causes pauses. Using a TM instead of the API effectively gets rid of the pauses. Granted, for some people those pauses are not annoying (or hardly noticeable, if their software pauses between segments anyway).
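To make the difference concrete, here is a minimal Python sketch of what "using a TM instead of the API" amounts to: the translations are loaded once, and every later lookup happens in memory with no network round trip. The function names and the sample word list are hypothetical, and the sketch does exact matching only, whereas real CAT tools also offer fuzzy matches.

```python
def load_tm(rows):
    """Build an exact-match lookup table from (source, target) pairs."""
    return {source: target for source, target in rows}


def pretranslate(segments, tm):
    """Return (segment, suggestion) pairs; the suggestion is None on a miss."""
    return [(segment, tm.get(segment)) for segment in segments]


# Hypothetical word list and machine translations, for illustration only.
tm = load_tm([("Open", "Öffnen"), ("Save", "Speichern")])
suggestions = pretranslate(["Save", "Print"], tm)
```

Segments that are missing from the TM simply come back without a suggestion, so they can still be sent to the API one at a time if needed.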

I've been thinking about the various GT "facilitators" out there, and I have a sneaking suspicion they probably aren't paying Google anything.


Please keep suspicion-mongering to a separate thread where like-minded people can reply with either evidence or further suspicions. Besides, it has nothing to do with the current topic, as the original poster clearly uses Google's own service directly.


[Edited at 2014-11-12 09:26 GMT]


Michael Joseph Wdowiak Beijer
United Kingdom
Member (2009)
Dutch to English
Translating lists of words … with Google Translate? Nov 12, 2014

Samuel Murray wrote:

You say that the API is too slow, but how fast do you need it to be?


The original post was specifically about translating lists of words. The API causes pauses of up to a second between each segment. When translating lists, those pauses can be excruciating.



Translating lists of words with Google Translate? Wow, whoever tries such a thing deserves your tortuous workflow.

Michael


Samuel Murray
Netherlands
Member (2006)
English to Afrikaans
Translating lists Nov 12, 2014

Michael Beijer wrote:
Samuel Murray wrote:
You say that the API is too slow, but how fast do you need it to be?

The original post was specifically about translating lists of words. The API causes pauses of up to a second between each segment. When translating lists, those pauses can be excruciating.

Translating lists of words with Google Translate? Wow, whoever tries such a thing deserves your tortuous workflow.


1. Google Translate is ideally suited to translating lists of words. How would you translate lists of words... with a dictionary next to the keyboard?

2. My workflow is completed in under a minute. If that is tortuous, then you really need to find another career.


Michael Joseph Wdowiak Beijer
United Kingdom
Member (2009)
Dutch to English
…with 29 dictionaries! Nov 12, 2014

Samuel Murray wrote:

1. Google Translate is ideally suited to translating lists of words. How would you translate lists of words... with a dictionary next to the keyboard?

Depends on the list of words.

I generally don't translate lists of words, and if I do, I get paid a lot more than my regular word rate, so yes, with a dictionary (or 15) on my desk, and another god knows how many on my computer, plus …
• Google Translate + Microsoft Translator in little boxes in CafeTran (via their respective APIs),
• my TMs and external databases inside CafeTran,
• my 40 million-TU TMLookup TMX database,
• my dtSearch/Copernic desktop search programs,
• IntelliWebSearch,
plus … plus … plus …

Samuel Murray wrote:
2. My workflow is completed in under a minute. If that is tortuous, then you really need to find another career.

Anything involving PlusTools is by definition tortuous. Isn't it from 1989?

Michael


Milan Condak
English to Czech
Instant translation Nov 12, 2014

Samuel Murray wrote:

1. Google Translate is ideally suited to translating lists of words.

2. My workflow is completed in under a minute.


My workflow takes between 1 and 10 minutes.

I created a presentation "Instant translation" four years ago.

There are 8 pages in Czech:

http://www.condak.net/machine_t/cs/instant/cs/00.html

03.html = 2010: hunalign, 2014: I use LF Aligner 4.05

05.html = Okamžitý překlad / Instant translation

Video (7 seconds) showing the translation process (less than 1 second for 137 segments)

I do not use this method anymore. For translation with Google Translate I use a desktop application for Windows (like Samuel, but not QTranslate).

For bulk editing of target TUs in a TMX I use Virtaal (spaces, mistranslations, ...), and for individual editing I get suggestions from several engines:

http://www.condak.net/machine_t/qtranslate/20131101/cs/03.html

The edited TMX is used directly in OmegaT.

Using a dictionary created from extracted words and GT in OmegaT:

http://www.condak.net/cat_other/omegat/2013-11-24/cs/03.html

Anything that is wrong, and any multi-word phrase, is entered into the glossary manually.

Happy PEMT in your own hands.

Milan


Samuel Murray
Netherlands
Member (2006)
English to Afrikaans
@Salam Nov 14, 2014

Salam Alrawi wrote:
Why would you remove something (e.g. Sensitive information) from file bbb.txt? Will Google pick it up or something?


There are those who believe that Google might do that, yes.

But I often just want to make sure the TM I create is sanitised, and it's easier to sanitise it before sending it to Google Translate than afterwards. For example, Google Translate might translate "London, Bristol, Manchester" as "London, Bristol, Manchester", but it might also translate it as "London, Bristol, Edinburgh" (because Google isn't intelligent, and yes, Google can translate such strings in such odd ways), and if I try to remove "Manchester" from the TM, I would be stuck with Edinburgh in the TM.
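One way to sanitise the text before it goes to Google Translate can be sketched in Python as follows. This is an assumption about how such a step might look, not a description of the preprocessing actually used above: the function name `mask_terms` and the `[REDACTED]` placeholder are invented for the example, and how a machine translation engine treats such a placeholder is not guaranteed. Masking rather than deleting keeps one segment per line, so the file can still be aligned with the original afterwards.

```python
import re


def mask_terms(segments, sensitive_terms, placeholder="[REDACTED]"):
    """Replace every sensitive term in every segment with a placeholder."""
    if not sensitive_terms:
        # An empty pattern would match everywhere, so return a plain copy.
        return list(segments)
    # Longest terms first, so a longer term wins over a shorter one it contains.
    ordered = sorted(sensitive_terms, key=len, reverse=True)
    pattern = re.compile(
        "|".join(re.escape(term) for term in ordered),
        re.IGNORECASE,
    )
    # A function replacement avoids any special handling of backslashes.
    return [pattern.sub(lambda _match: placeholder, s) for s in segments]


masked = mask_terms(["London, Bristol, Manchester"], ["Manchester"])
```

Because the term is gone before translation, it cannot come back in a changed form (such as "Edinburgh") on the target side of the TM.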

