Machine translation not working lately
Thread poster: elm0505

Tomás Cano Binder, BA, CT  Identity Verified
Spain
Local time: 13:19
Member (2005)
English to Spanish
+ ...
Non-applicable quoting will take you anywhere Oct 18, 2011

Samuel Murray wrote:
If you quote only the parts of it that appear to prove your argument, and neglect to quote the parts that clearly disprove it, then yes, you can make people believe anything. For a more complete picture, read this: http://www.proz.com/post/1803412#1803412

I think the terms of service of Google Translator Toolkit do not apply to the original poster's situation. But let's check:

TO THE ORIGINAL POSTER: When you used Google Translate from OmegaT, did you have a Google Account linked to the service?

If you use Google Translate on its own without an account, or via a derived service like Nicetranslator.com, or if your CAT tool's Google Translate plugin does not require any Google Account, Google Translator Toolkit's terms of service do not apply either.

In my opinion, it is a bit absurd to believe that Google Translate is safe because Google Translator Toolkit's TOS include a provision that only your unshared memories are private. A majority of translators do not have a Google Account in the first place, and the specific TOS only apply if you use Google Translator Toolkit itself, and not a plugin in a CAT tool, as the majority of people would use.

This is Kilgray's warning about the use of their Google Translate plugin:
A word of caution: use this with care. If you download and enable the plugin, your segments will be going to the cloud - and that's not our cloud. Always make sure no confidential segments are sent to Google's service - this may mean a breach of confidentiality with your client.

Quite explicit, I reckon.


 

Samuel Murray  Identity Verified
Netherlands
Local time: 13:19
Member (2006)
English to Afrikaans
+ ...
SITE LOCALIZER
Aligned GT translation Oct 18, 2011

Lutz Molderings wrote:
Samuel Murray wrote:
In the mean time, why not extract your source text, have it translated by GTT's web site system, then align it with your extracted text, and create a TM that you drop into the /tm/ folder?

Yep, that's what I've been doing over the last couple of weeks since GT hasn't been working properly any more. The whole process only takes a few minutes and it's a nice workaround if GT is of any use to you.


@elm: For an OmegaT user to do the aligned-GT thing, it would be easiest (I think) to create a temporary OmegaT project and load the files in it, and then copy all the segments (from the Edit pane or from the Find dialog, if you search for "*" with regex) to two text files, one of which is then translated by GT. The two files can then easily be aligned, and the result will contain the OmegaT formatting tags in more or less the right places.
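A minimal Python sketch of turning the two text files into a TM, assuming they already line up one segment per line (so no real alignment is done here). The function name `build_tmx`, the header values, and the choice to store OmegaT tags as escaped plain text are assumptions for illustration, not something taken from OmegaT's documentation.

```python
import xml.etree.ElementTree as ET

# Clark-notation name for the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(source_lines, target_lines, src_lang, tgt_lang):
    """Pair source/target segments one-to-one and return a TMX string."""
    if len(source_lines) != len(target_lines):
        raise ValueError("source and target must have the same number of segments")

    tmx = ET.Element("tmx", version="1.4")
    # Header values below are placeholders (assumptions), not required names.
    ET.SubElement(tmx, "header", {
        "creationtool": "aligned-gt-sketch",
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "plain text",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for src, tgt in zip(source_lines, target_lines):
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, src), (tgt_lang, tgt)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            # Tags such as <f0> are written as escaped text inside <seg>.
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")
```

The returned string could be saved as a UTF-8 `.tmx` file and dropped into the project's /tm/ folder. If GT merges or splits lines, the one-to-one pairing no longer holds and a proper aligner is needed.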

I have found that GT screws up the formatting tags (adds spaces where no spaces should be), so I wrote a find/replace macro that first changes space+TAG to space+underscore+TAG before I put the text through GT. I then run another find/replace macro on the output that uses the presence of the underscores to know where the spaces should be and changes it back, so that those formatting tags that should have spaces have spaces, and those that should have no spaces have no spaces.
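A minimal Python sketch of that find/replace pair, not the original macro. It assumes OmegaT-style tags such as `<f0>`, `</f0>` and `<br1/>`, and it assumes the text never has a genuine underscore directly before a tag; the names `protect_spaces` and `restore_spaces` are hypothetical.

```python
import re

# Assumed tag shape: <f0>, </f0>, <br1/> and similar.
TAG = r"</?[a-z]+\d+/?>"


def protect_spaces(text):
    """Before MT: mark every space that precedes a tag as space+underscore."""
    return re.sub(r" (?=" + TAG + ")", " _", text)


def restore_spaces(text):
    """After MT: keep one space before a tag only where an underscore marks it."""
    def fix(match):
        marked = match.group(1) == "_"
        return (" " if marked else "") + match.group(2)

    # Swallow any whitespace (and the marker) sitting in front of a tag.
    return re.sub(r"\s*(_?)\s*(" + TAG + ")", fix, text)
```

For example, `protect_spaces("Click <f0>here</f0> now")` gives `"Click _<f0>here</f0> now"`, and `restore_spaces` turns MT output such as `"Click _<f0>here </f0> now"` back into `"Click <f0>here</f0> now"`. Only spaces before tags are handled, as in the description above; spaces inserted after a tag would need a second pass.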

GT also adds invisible invalid characters to the translation, which I have written a macro for removing again. The MS Word character code for those characters is ChrW(8203) (not sure what it would be in your preferred scripting language).
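In Python, for instance, the same clean-up could look like this. ChrW(8203) is the zero-width space, U+200B; the function name `strip_zero_width` is hypothetical.

```python
# ChrW(8203) in MS Word VBA is the zero-width space, U+200B.
ZERO_WIDTH_SPACE = chr(8203)


def strip_zero_width(text):
    """Remove every zero-width space from the MT output."""
    return text.replace(ZERO_WIDTH_SPACE, "")
```

This removes only U+200B; if other invisible characters turn up, they would have to be added explicitly.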

@Lutz: You're a Trados 2009 user, aren't you Lutz? Does Trados 2009 allow for easy penalising of TUs (segments) based on creation ID or similar attributes?

@elm again: OmegaT can't do penalising of TUs yet, so if one wants to penalise TUs one has to create a second TM and add an extra rubbish character to the source text (e.g. a single "! " before the rest of the text should help drop the match percentage by 5-15% for medium length segments). Adding that same character to the target text should help prevent one from accidentally validating a non-edited GT TU. In an OmegaT TM (opened in a Unicode text editor), that is as simple as replacing "<seg>" with "<seg>! ".
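The same replacement can be scripted instead of done in a text editor. A minimal Python sketch, assuming the TM is a UTF-8 TMX file whose `<seg>` elements have no attributes; the function names and file paths are hypothetical.

```python
def penalise_tmx(tmx_text, marker="! "):
    """Prefix every segment (source and target) with a marker string."""
    return tmx_text.replace("<seg>", "<seg>" + marker)


def penalise_file(src_path, dst_path, marker="! "):
    """Read a TMX file, penalise it, and write the result as a second TM."""
    with open(src_path, encoding="utf-8") as handle:
        content = handle.read()
    with open(dst_path, "w", encoding="utf-8") as handle:
        handle.write(penalise_tmx(content, marker))
```

Writing to a separate file keeps the original GT memory untouched, so the penalised copy can simply be deleted if the marker turns out to be a nuisance.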


 

Samuel Murray  Identity Verified
Netherlands
Local time: 13:19
Member (2006)
English to Afrikaans
+ ...
SITE LOCALIZER
Clarification Oct 18, 2011

Tomás Cano Binder, CT wrote:
I think the terms of service of Google Translator Toolkit do not apply to the original poster's situation.


No, they don't; you're right. But they do apply to some of the replies (those that related to what one might do if the CAT tool's GT function no longer works).

In my opinion, it is a bit absurd to believe that Google Translate is safe because Google Translator Toolkit's TOS include a provision that only your unshared memories are private.


Yes, one has to have one's eyes open -- I don't know what the TOS of the GT API that CAT tools use says, but I do know what GTT's says, so I know that using GTT is safe, but I have no idea if using GT via a CAT tool is safe.

I did not realise that your comment related specifically to non-GTT use of GT -- it sounded to me like you were saying that *all* use of GT (in any of its forms) is unsafe. A good way of convincing people that the public GT is unsafe would in fact be to compare it to GTT and to say that the public GT is *not* protected by those extra terms of service.

A majority of translators do not have a Google Account in the first place, and the specific TOS only apply if you use Google Translator Toolkit itself, and not a plugin in a CAT tool, as the majority of people would use.


I agree, and that is why I think one should make this clear when talking about the topic, i.e. not say "GT is bad" but rather "GT is not as safe as GTT" or even "GT is worse than GTT".

This is Kilgray's warning about the use of their Google Translate plugin:
Always make sure no confidential segments are sent to Google's service - this may mean a breach of confidentiality with your client.


I think this just goes to show that if GT should offer the possibility for translators to use their own GTT accounts (or a similar feature whereby each translator logs in to GT with a unique user ID), then the CAT tool vendors should offer this possibility, because then translators won't be sending their segments into a crowd-based public cloud but into a private cloud shielded from prying eyes.


 

Tomás Cano Binder, BA, CT  Identity Verified
Spain
Local time: 13:19
Member (2005)
English to Spanish
+ ...
Most people don't know that GTT exists! Oct 18, 2011

Samuel Murray wrote:
Yes, one has to have one's eyes open -- I don't know what the TOS of the GT API that CAT tools use says, but I do know what GTT's says, so I know that using GTT is safe, but I have no idea if using GT via a CAT tool is safe.

I did not realise that your comment related specifically to non-GTT use of GT -- it sounded to me like you were saying that *all* use of GT (in any of its forms) is unsafe. A good way of convincing people that the public GT is unsafe would in fact be to compare it to GTT and to say that the public GT is *not* protected by those extra terms of service.

Correct.

Now, do you think most translators using Google Translate today are using Google Translator Toolkit? I bet they are not. They will be using their CAT tool's plugin for Google Translate, and such plugins are not covered by GTT's terms of service at all.

Also, you mention the Google Translate API's terms of service and wonder whether they protect your stuff. Well, my friend, whatever those TOS may say, if the API only sends the segment for translation and no Google Account/GTT resources are linked to it (at least in the case of memoQ, no Google Account is required), how the hell is your stuff going to be protected? It is simply uploaded to Google's servers and they may use it all as they wish, according to Google's general terms of service.

So my warning about using Google Translate is fully valid.

[Edited at 2011-10-18 06:23 GMT]


 

Samuel Murray  Identity Verified
Netherlands
Local time: 13:19
Member (2006)
English to Afrikaans
+ ...
SITE LOCALIZER
GTAPIv2 vs GTAPIv1 Oct 18, 2011

Tomás Cano Binder, CT wrote:
Also, you mention the Google Translate API's terms of service and wonder whether they protect your stuff.


I googled a bit and found it. GT API v1 seems to have used the general privacy policy by which data was not kept as confidential as one might have wished. GT API v2 is more explicit in its terms of use (see section 5):

Submission of Content and Data Confidentiality.
* Google does not claim any ownership in any of the content ... that you ... transmit in the API.
* ...you give Google a ... license to ... use [the transmitted] content ... for the sole purpose of providing you with the API and to ensure the functioning of Google products and services.
* We will not share the content you upload with any other third party.
* In addition, we provide SSL connection for secure connectivity to the API.


Of course, granting a third party (such as Google) a license to translate the file might still contravene a confidentiality agreement, unless the agreement mentions something about parties to whom you subcontract work and in what ways they are bound by the agreement.

...if the API only sends the segment for translation and no Google Account/GTT resources are linked to it (at least in the case of memoQ, no Google Account is required)...


In the case of memoQ (and in the case of v1), the API user account of memoQ's developers is linked to it. I don't quite understand from v2 whether individual users will be able to have individual accounts.


 

Tomás Cano Binder, BA, CT  Identity Verified
Spain
Local time: 13:19
Member (2005)
English to Spanish
+ ...
Will Google sign a confidentiality agreement with you? Oct 18, 2011

Samuel Murray wrote:
Of course, granting a third party (such as Google) a license to translate the file might still contravene a confidentiality agreement, unless the agreement mentions something about parties to whom you subcontract work and in what ways they are bound by the agreement.

You cannot bind someone who does not sign the agreement. Yes, your agreement with a translation customer may say that you are obliged to have your subcontractors sign confidentiality agreements similar to the one you have with the customer, but I reckon nobody in their right mind would try to send Google a confidentiality agreement before using Google Translate. Google would of course reject such a request, and hence any use of Google Translate outside of Google Translator Toolkit would be a breach of your confidentiality agreement with the customer.


 

msafiri  Identity Verified
United Kingdom
Local time: 12:19
French to English
+ ...
Interesting, but how to make OmegaT work with MT now? Oct 18, 2011

The discussion is very interesting in the issues it raises, but to get back to the original point for those who need to get on with the job of using OmegaT to translate using the help of MT:

- how, in practice, can we get OmegaT working again in the short term?

The other MT options - Apertium and Belazar - don't seem to work either. Do they require membership formalities before they will work in OmegaT?


 

elm0505
Spain
Local time: 13:19
French to Spanish
+ ...
TOPIC STARTER
No prob for me Oct 18, 2011

Tomás Cano Binder, CT wrote:

Samuel Murray wrote:
If you quote only the parts of it that appear to prove your argument, and neglect to quote the parts that clearly disprove it, then yes, you can make people believe anything. For a more complete picture, read this: http://www.proz.com/post/1803412#1803412

I think the terms of service of Google Translator Toolkit do not apply to the original poster's situation. But let's check:

TO THE ORIGINAL POSTER: When you used Google Translate from OmegaT, did you have a Google Account linked to the service?

If you use Google Translate on its own without an account, or via a derived service like Nicetranslator.com, or if your CAT tool's Google Translate plugin does not require any Google Account, Google Translator Toolkit's terms of service do not apply either.

In my opinion, it is a bit absurd to believe that Google Translate is safe because Google Translator Toolkit's TOS include a provision that only your unshared memories are private. A majority of translators do not have a Google Account in the first place, and the specific TOS only apply if you use Google Translator Toolkit itself, and not a plugin in a CAT tool, as the majority of people would use.

This is Kilgray's warning about the use of their Google Translate plugin:
A word of caution: use this with care. If you download and enable the plugin, your segments will be going to the cloud - and that's not our cloud. Always make sure no confidential segments are sent to Google's service - this may mean a breach of confidentiality with your client.

Quite explicit, I reckon.


Yes, I do have a Google Account, but in my particular case I don't worry too much about the privacy issue because my translations are not "serious documents". They're simple descriptions that are immediately uploaded to the website of the company I work at, and not only that, those texts are also used by other similar companies that import content from our website. Not to mention that my source texts are sometimes copied from other websites whose products we are selling.


 

Tomás Cano Binder, BA, CT  Identity Verified
Spain
Local time: 13:19
Member (2005)
English to Spanish
+ ...
OK! Oct 18, 2011

elm0505 wrote:
Yes, I do have a Google Account, but in my particular case I don't worry too much about the privacy issue because my translations are not "serious documents". They're simple descriptions that are immediately uploaded to the website of the company I work at, and not only that, those texts are also used by other similar companies that import content from our website. Not to mention that my source texts are sometimes copied from other websites whose products we are selling.

OK! Now, is your employer aware that their source texts get uploaded to Google and that Google may use them at its discretion and for any purpose?

In the future, if you do freelance work, please remember that there is no such thing as a minor violation of a privacy agreement: if you disclose one sentence, it is the same as if you disclosed 20 pages, just as stealing one apple is theft as much as stealing a ton of apples.

[Edited at 2011-10-18 10:07 GMT]


 

Didier Briel  Identity Verified
France
Local time: 13:19
English to French
+ ...
Create a TM Oct 18, 2011

msafiri wrote:
- how, in practice, can we get OmegaT working again in the short term?

(OmegaT is working. It is Google Translate that is not.)
Samuel gave you a workaround. Create a TM with Google Translate (Toolkit) and use it in OmegaT.

The other MT options - Apertium and Belazar - don't seem to work either.
Do they require membership formalities before they will work in OmegaT?

Apertium works fine, and requires no formality. Perhaps your language pair is not supported by Apertium:
http://www.apertium.org/

Belazar requires a server, and works strictly between Belarusian and Russian, so I doubt you have much use for it:
http://belsoft.of.by/belazar/

Didier


 

Jaroslaw Michalak  Identity Verified
Poland
Local time: 13:19
Member (2004)
English to Polish
What is public cannot be disclosed Oct 18, 2011

Tomás Cano Binder, CT wrote:
OK! Now, is your employer aware that their source texts get uploaded to Google and that Google may use them at its discretion and for any purpose?

In the future, if you do freelance work, please remember that there is no such thing as a minor violation of a privacy agreement: if you disclose one sentence, it is the same as if you disclosed 20 pages, just as stealing one apple is theft as much as stealing a ton of apples.


It might depend on the jurisdiction, but as far as I know if a text is publicly available, then it cannot be "disclosed" as far as privacy agreements are concerned. And if it is on a website, then most likely it will be downloaded and "used" by Google anyway (if only for indexing purposes).

Whether the client objects, of course, is another matter.


 

Tomás Cano Binder, BA, CT  Identity Verified
Spain
Local time: 13:19
Member (2005)
English to Spanish
+ ...
Of course, but... Oct 18, 2011

Jabberwock wrote:
Tomás Cano Binder, CT wrote:
OK! Now, is your employer aware that their source texts get uploaded to Google and that Google may use them at its discretion and for any purpose?

It might depend on the jurisdiction, but as far as I know if a text is publicly available, then it cannot be "disclosed" as far as privacy agreements are concerned. And if it is on a website, then most likely it will be downloaded and "used" by Google anyway (if only for indexing purposes).

Whether the client objects, of course, is another matter.

Of course you are right. The issue here is that the text probably gets disclosed before the customer intends to make it available to the public.


 

Samuel Murray  Identity Verified
Netherlands
Local time: 13:19
Member (2006)
English to Afrikaans
+ ...
SITE LOCALIZER
A quick (temporary) video Oct 18, 2011

msafiri wrote:
- how, in practice, can we get OmegaT working again in the short term?


http://www.youtube.com/watch?v=dGgmgzQRrnw

(very quickly done, no voice)


 