Google Translate and confidentiality
Thread poster: Samuel Murray

Samuel Murray  Identity Verified
Netherlands
Local time: 18:06
Member (2006)
English to Afrikaans
+ ...
Sep 10, 2009

G'day everyone

Some translators are concerned about confidentiality when using Google Translate, but is there any specific information anywhere on the web about whether Google stores the text you upload and/or makes it available to the public at some point? My own logic tells me that Google has nothing to gain from harvesting its own MT output, and there is no shortage of typical source texts on the web, so why would Google want to store it and/or reuse it? Isn't the whole point of statistical MT that it improves when fed with real translations? What benefit could there possibly be for Google Translate to feed itself its own translations, then?

Thanks
Samuel


 

Grzegorz Gryc  Identity Verified
Local time: 18:06
French to Polish
+ ...
Data mining... adding better translations... Sep 10, 2009

Samuel Murray wrote:

Some translators are concerned with confidentiality when using Google Translate, but is there any specific information anywhere on the web about whether Google stores the stuff you upload and/or make it available to the public at some point?
My own logic tells me that Google has nothing to gain from harvesting its own MT output,

IMHO you're right.

and there is no shortage of typical source texts on the web, so why would Google want to store it and/or reuse it?

The source may be stored for data mining.
The text you send for translation may contain valuable information.

Isn't the whole point of statistical MT that it improves when fed with real translation? What benefit could there possibly be for Google Translate to feed itself its own translations, then?

None.
But you may click the button "Contribute a better translation" (available in the standard Google Translate interface) and improve the quality.
Of course, only if you want to do it.

Cheers
GG


 

Quamrul Islam  Identity Verified
Local time: 22:06
Member (2009)
English to Bengali
+ ...
I agree with you. Sep 10, 2009

Hello Samuel,
I completely agree with your argument that Google should have no interest in collecting arbitrary source texts along with its own machine translation. However, I have a question for everyone: isn't it at least possible that Google collects only the source texts (and not the MT output) to enrich its database of source terms in any particular language?

Thanks
Quamrul


 

Rodolfo Raya  Identity Verified
Local time: 13:06
English to Spanish
Source text is the key Sep 10, 2009

Hi,

As a CAT tool developer I can tell you that the translation is not important; the source text is what is invaluable.

We need source documents to improve parsers, segmenters and many other details. If you provide fragments of a document you are still contributing a lot.

If you signed an NDA with your client, you cannot provide the source text to anyone. Don't use a web-based Machine Translation engine unless you are the owner of the engine.

Regards,
Rodolfo


 

Samuel Murray  Identity Verified
Netherlands
Local time: 18:06
Member (2006)
English to Afrikaans
+ ...
TOPIC STARTER
At Rodolfo Sep 10, 2009

Rodolfo Raya wrote:
We need source documents to improve parsers, segmenters and many other details. If you provide fragments of a document you are still contributing a lot. ... If you signed an NDA with your client, you cannot provide the source text to anyone.


Not all NDAs are the same, so one can't really say "if NDA then no Google Translate". Whether it is a breach of confidentiality to upload content to an automated online text processing service is for every translator to determine. The question of whether it is a breach of an NDA has been discussed many times.

My question here is whether anyone knows for certain if Google will or will not (at some stage) make the submitted information public (in part or in whole). I mean, if my NDAs permit me to submit my source texts to a non-disseminating text processing service, then my only concern would be whether Google Translate is disseminating or non-disseminating.


 

Grzegorz Gryc  Identity Verified
Local time: 18:06
French to Polish
+ ...
Google Terms of Service... Sep 10, 2009

Samuel Murray wrote:


My question here is whether anyone knows for certain if Google will or will not (at some stage) make the submitted information public (in part or in whole). I mean, if my NDAs permit me to submit my source texts to a non-disseminating text processing service, then my only concern would be whether Google Translate is disseminating or non-disseminating.

Quoting Google:

11. Content licence from you

11.1 You retain copyright and any other rights you already hold in Content which you submit, post or display on or through, the Services. By submitting, posting or displaying the content you give Google a perpetual, irrevocable, worldwide, royalty-free, and non-exclusive licence to reproduce, adapt, modify, translate, publish, publicly perform, publicly display and distribute any Content which you submit, post or display on or through, the Services. This licence is for the sole purpose of enabling Google to display, distribute and promote the Services and may be revoked for certain Services as defined in the Additional Terms of those Services.

11.2 You agree that this licence includes a right for Google to make such Content available to other companies, organisations or individuals with whom Google has relationships for the provision of syndicated services, and to use such Content in connection with the provision of those services.

11.3 You understand that Google, in performing the required technical steps to provide the Services to our users, may (a) transmit or distribute your Content over various public networks and in various media; and (b) make such changes to your Content as are necessary to conform and adapt that Content to the technical requirements of connecting networks, devices, services or media. You agree that this licence shall permit Google to take these actions.

11.4 You confirm and warrant to Google that you have all the rights, power and authority necessary to grant the above licence.


Assume the worst-case scenario.

Cheers
GG


 

Samuel Murray  Identity Verified
Netherlands
Local time: 18:06
Member (2006)
English to Afrikaans
+ ...
TOPIC STARTER
At Grzegorz Sep 10, 2009

Grzegorz Gryc wrote:
Quoting Google:
11. Content licence from you...
Assume the worst-case scenario.


The quote comes from the agreement that you sign with Google when you open a Google account. The Google Translate service, however, is available to everyone, not only to people who have Google accounts.

But you raise an interesting point -- doesn't this mean that any translator with a Gmail account is in breach of his NDAs?


 

Derek Gill Franßen  Identity Verified
Germany
Local time: 18:06
German to English
+ ...
Just a side-note... Sep 10, 2009

Samuel Murray wrote:
My own logic tells me that Google has nothing to gain from harvesting its own MT output, and there is no shortage of typical source texts on the web, so why would Google want to store it and/or reuse it? [...]

Sorry for the sidetrack, but this discussion reminded me of a podcast I recently heard on NPR (National Public Radio) titled "Who Really Owns Your Digital Data?"

Here is a quote from that podcast: "The other thing is that disk storage is incredibly cheap. So there's actually no reason anymore to delete any data. We don't really have reason to delete data at home. We can buy a terabyte disk for $100, and similarly, Google and Yahoo and Microsoft have no need to delete any data." (See http://www.npr.org/templates/story/story.php?storyId=111421072 .)


 

Piotr Bienkowski  Identity Verified
Poland
Local time: 18:06
Member (2005)
English to Polish
+ ...
The translator takes the responsibility Sep 10, 2009

Rodolfo's Swordfish allows you to upload only one sentence at a time to Google Translate through the GT plugin.

So the translator takes responsibility for deciding whether to upload a particular sentence/segment or not.

Regards

Piotr



[Edited at 2009-09-10 15:00 GMT]


 
Adam Łobatiuk  Identity Verified
Poland
Local time: 18:06
Member (2009)
English to Polish
+ ...
Doesn't need to be just statistical Sep 10, 2009

Samuel Murray wrote:
Isn't the whole point of statistical MT that it improves when fed with real translation? What benefit could there possibly be for Google Translate to feed itself its own translations, then?


The selling point of Google Translate is not that it is a 'statistical MT' system, but simply that it is an MT system. It might combine statistical methods with submitted human translations that could be given a higher priority by the system. Users are not interested in the sources of a translation.

By the way, do Swordfish and Wordfast actually submit edited machine translations back to Google or just the source? I suspect that the MT in SDL Trados might submit both, and they are a translation company that could use them for their own purposes, but fortunately it doesn't even support my language yet.


 

Rodolfo Raya  Identity Verified
Local time: 13:06
English to Spanish
Swordfish sends source only Sep 10, 2009

Adam Łobatiuk wrote:
By the way, do Swordfish and Wordfast actually submit edited machine translations back to Google or just the source?


Swordfish sends only source text, without any formatting (all tags are removed before contacting Google's server).
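
For illustration only, here is a minimal sketch of what removing tags from a segment before sending it to an MT server might look like. This is not Swordfish's actual code: the function name `strip_inline_tags` and the XLIFF-style sample markup are assumptions made for the example.

```python
import re

# Hypothetical helper, not taken from Swordfish: matches anything that
# looks like a markup tag, e.g. <g id="1">, </g> or <x id="2"/>.
TAG_PATTERN = re.compile(r"<[^>]+>")


def strip_inline_tags(segment: str) -> str:
    """Return the segment with markup tags removed and whitespace normalised."""
    without_tags = TAG_PATTERN.sub("", segment)
    # Collapse any runs of whitespace left behind by the removed tags.
    return " ".join(without_tags.split())


if __name__ == "__main__":
    sample = 'Click <g id="1">Save</g> to <x id="2"/>continue.'
    print(strip_inline_tags(sample))
```

Note that a sketch like this removes formatting only; the wording of the source sentence still reaches the server unchanged, so it does nothing for confidentiality.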

As Piotr said, translators control when to send requests to Google from within Swordfish.

Regards,
Rodolfo


 

