Plagiarism Scanner as translation resource?
Thread poster: Noe Tessmann
Noe Tessmann  Identity Verified
Local time: 11:22
English to German
+ ...
May 13, 2008

Dear colleagues,

I have always wondered whether plagiarism-detection software could be used for translation purposes. Many authors copy parts of their texts from the internet, and sometimes it would be helpful to know before you start working whether translations already exist in your language, for example on the EU website.

It should work like any CAT tool looking for fuzzy matches. The best thing would be a rough online alignment, but this seems to be too good to be true.
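To make "fuzzy match" concrete, here is a minimal sketch using Python's standard difflib module. The function name, the 0.75 threshold and the sample sentences are illustrative assumptions, not how any particular CAT tool or plagiarism scanner works.

```python
from difflib import SequenceMatcher


def fuzzy_matches(segment, candidates, threshold=0.75):
    """Return (score, candidate) pairs at or above the threshold, best first.

    The threshold is an arbitrary illustrative value.
    """
    scored = []
    for candidate in candidates:
        # ratio() is a similarity score between 0.0 and 1.0
        score = SequenceMatcher(None, segment.lower(), candidate.lower()).ratio()
        if score >= threshold:
            scored.append((round(score, 2), candidate))
    return sorted(scored, reverse=True)


# Hypothetical sentences standing in for text already published on the web.
published = [
    "Flexicurity combines flexible labour markets with employment security.",
    "The weather in Brussels was fine.",
]
hits = fuzzy_matches(
    "Flexicurity combines flexible labor markets and employment security.",
    published,
)
```

Here `hits` contains only the first published sentence; the unrelated one falls below the threshold.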


Does anyone use such software and if yes what are your experiences?


Thanks in advance

Noe


[Edited at 2008-05-13 13:38]


 

Attila Piróth  Identity Verified
France
Local time: 11:22
Member
English to Hungarian
+ ...
Wordfast May 13, 2008

Noe Tessmann wrote:

The best thing would be a rough online alignment, but this seems to be too good to be true.



Check out the Very Large Translation Memory (VLTM) project: http://www.wordfast.net/index.php?whichpage=jobs&lang=engb

Attila


 

Jack Doughty  Identity Verified
United Kingdom
Local time: 10:22
Member (2000)
Russian to English
+ ...
Exam cheats May 13, 2008

I don't know the answer to this, but there were some stories in the UK press a few months ago about examiners using software to check exam papers, degree theses etc. for plagiarism.

 
Noe Tessmann  Identity Verified
Local time: 11:22
English to German
+ ...
TOPIC STARTER
Inspiration May 13, 2008

Hello,

I am thinking of situations where you realise at the end of your translation that large paragraphs already exist somewhere on the web and could at least have inspired you.


@ Attila:

I am thinking of machine alignment, not human alignment; asking for a TM would be too much.
Part of the EU website is now accessible through the DGT TM, but does the rest have to be scanned by hand?

Regards

Noe


 

Tony M  Identity Verified
France
Local time: 11:22
Member
French to English
+ ...
Google helps a lot! May 13, 2008

Hardly a systematic way of going about it, but I very often find that if I Google a salient chunk of text, I come up with the exact-same document that I'm working on, and once I've found that, it often leads on to other treasures...

 
Noe Tessmann  Identity Verified
Local time: 11:22
English to German
+ ...
TOPIC STARTER
Help Google further May 13, 2008

Hello Tony,

Yes, but there is software that checks every chunk of your text against Google and tells you whether the author took some pieces from the internet. From experience you could then usually tell whether there is a chance that something exists in your language.
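Roughly, the chunk-by-chunk idea could be sketched like this in Python. This is only a sketch under assumptions: the function names, the parameter names `record_length` and `increment`, and their default values are illustrative, and the code only builds exact-phrase query strings; it does not contact any search engine.

```python
def sample_phrases(text, record_length=7, increment=50):
    """Take `record_length` consecutive words, starting every `increment` words.

    Both parameters are assumptions chosen for illustration.
    """
    words = text.split()
    phrases = []
    for start in range(0, len(words) - record_length + 1, increment):
        phrases.append(" ".join(words[start:start + record_length]))
    return phrases


def as_exact_queries(phrases):
    """Wrap each phrase in double quotes, the usual exact-phrase search syntax."""
    return ['"' + phrase.replace('"', "") + '"' for phrase in phrases]


# Dummy 120-word text: w0 w1 ... w119
text = " ".join("w%d" % i for i in range(120))
queries = as_exact_queries(sample_phrases(text))
```

Each resulting string could then be pasted into a search engine by hand, which is essentially what several people in this thread already do with a single salient chunk.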



Noe

[Edited at 2008-05-13 17:14]


 
xxxLia Fail  Identity Verified
Spain
Local time: 11:22
Spanish to English
+ ...
software May 13, 2008

Noe Tessmann wrote:

yes but there is software



Noe,

No one seems to know exactly what you are talking about, but I've had a similar experience to Tony and discovered plagiarism using my sixth sense and Google. There IS software, publishing houses use it, and if you search Google for "plagiarism software" you will find lots of it, some of it even free.

As a translator I don't use this software; it doesn't seem to fit with what translators are required to do (I think editors at publishing houses are more responsible for this kind of "policing").

Yet I do take a principled stand and, difficult though it is, I somehow manage to get out of translating anything that shows evidence of plagiarism.


 

Daina Jauntirans  Identity Verified
Local time: 04:22
Member (2005)
German to English
+ ...
Have not run into this May 14, 2008

I have not run into this issue myself in business and financial translation. I could see that this would be useful if authors re-use parts of their own company's materials (that's not plagiarism, though). Often they do this, but don't mention which parts of a report or document may have been taken from other material. In that case, it would be useful to find the other document and a possible translation, since consistency is important.

On a different note, a friend of mine teaches online classes and is required to run all student work through plagiarism software. She has caught people this way.

[Edited at 2008-05-14 00:54]


 
xxxhazmatgerman
Local time: 11:22
English to German
Doughty post May 14, 2008

The British press source referred to was The Economist, AFAIK. You may be able to find the article on their website and start from there. Otherwise let me know and I'll go through my issues. Good luck.

 
Noe Tessmann  Identity Verified
Local time: 11:22
English to German
+ ...
TOPIC STARTER
Results of a test with plagiarism finder May 14, 2008

Hello,

Sorry for not being clear enough. Here is the result for a two-page text about flexicurity.
Some of the links seem useful, especially those from the EU website, but I am not willing to pay 50 euros for the full version of plagiarism finder.


I'll do some more tests.

Regards

Noe


BIBLIOGRAPHY
--------------------------------------------------------------------------------

All sources with a match of at least 100 characters are shown:

http://ec.europa.eu/employment_social/calls/2007/vt_2007_016/tenderspecs_de.pdf # 171 characters
http://ec.europa.eu/employment_social/employment_strategy/flex_meaning_en.htm # 154 characters
http://www.springerlink.com/index/6LCFUTWCC6HM5QWE.pdf # 140 characters
http://www.ingentaconnect.com/content/sage/j279/2003/00000025/00000005/art00012 # 140 characters
http://www.mt-archive.info/LREC-2000-Cucchiarini.pdf # 140 characters
http://64.233.183.104/search?q=cache:e-u5lKJH4XEJ:www.mt-archive.info/LREC-2000-Cucchiarini.pdf%20"Social%20Partners%20-----------------------------------------------End%20of%20translation%20test------------------------------------------%20Components"&hl=de&ct=clnk&cd=2&gl=de # 140 characters
http://www.tcd.ie/Germanic_Studies/localpages/bsg/texts/course/bsghandbook03.htm # 140 characters
http://www.wesc.ac.uk/tc/te-wcrwesc.pdf # 140 characters
http://files.idiominc.com/Globalization2020-MultilingualComputing.pdf # 140 characters
http://64.233.183.104/search?q=cache:JCLHLMo6R_IJ:files.idiominc.com/Globalization2020-MultilingualComputing.pdf%20"Social%20Partners%20-----------------------------------------------End%20of%20translation%20test------------------------------------------%20Components"&hl=de&ct=clnk&cd=5&gl=de # 140 characters
http://64.233.183.104/search?q=cache:B-bifCymmGYJ:www.ist-world.org/ProjectDetails.aspx?ProjectId=38d58da3d3234f4bb917e249266a319e%20"Social%20Partners%20-----------------------------------------------End%20of%20translation%20test------------------------------------------%20Components"&hl=de&ct=clnk&cd=1&gl=de # 140 characters
http://64.233.183.104/search?q=cache:tZKuXzj8a8sJ:language123.com/l/working_as_a_translator.html%20"Social%20Partners%20-----------------------------------------------End%20of%20translation%20test------------------------------------------%20Components"&hl=de&ct=clnk&cd=7&gl=de # 140 characters
http://language123.com/l/working_as_a_translator.html # 140 characters
http://64.233.183.104/search?q=cache:rZDzGFnamhYJ:ec.europa.eu/translation/reading/articles/pdf/2001_03_30_brussels_goetschalckx.pdf%20"Social%20Partners%20-----------------------------------------------End%20of%20translation%20test------------------------------------------%20Components"&hl=de&ct=clnk&cd=8&gl=de # 140 characters
http://ec.europa.eu/translation/reading/articles/pdf/2001_03_30_brussels_goetschalckx.pdf # 140 characters
http://64.233.183.104/search?q=cache:YJqrSGHEc6AJ:www.tcd.ie/Germanic_Studies/localpages/bsg/texts/course/bsghandbook03.htm%20"Social%20Partners%20-----------------------------------------------End%20of%20translation%20test------------------------------------------%20Components"&hl=de&ct=clnk&cd=6&gl=de # 140 characters
http://64.233.183.104/search?q=cache:ca4idSMoFkIJ:www.wesc.ac.uk/tc/te-wcrwesc.pdf%20"Social%20Partners%20-----------------------------------------------End%20of%20translation%20test------------------------------------------%20Components"&hl=de&ct=clnk&cd=9&gl=de # 140 characters
http://www.ist-world.org/ProjectDetails.aspx?ProjectId=38d58da3d3234f4bb917e249266a319e # 140 characters
http://64.233.183.104/search?q=cache:WLzQSvKrU7UJ:www.air.org/news/documents/AERA2005Test%20Translation%20Advantages.pdf%20"Social%20Partners%20-----------------------------------------------End%20of%20translation%20test------------------------------------------%20Components"&hl=de&ct=clnk&cd=10&gl=de # 140 characters
http://www.air.org/news/documents/AERA2005Test%20Translation%20Advantages.pdf # 140 characters
http://ec.europa.eu/employment_social/news/2007/jun/flexicurity_en.pdf # 104 characters
http://papers.ssrn.com/sol3/Delivery.cfm/SSRN_ID1118725_code983355.pdf?abstractid=1118725&mirid=1 # 104 characters


--------------------------------------------------------------------------------
RESULT OF THE EXAMINATION
--------------------------------------------------------------------------------

number of words in document: 736
therefrom examined words: 98
therefrom congruent words found in the Internet: 90

record length: 7
increment: 50

so that 13 % of all words have been examined
a total of 12 % congruent words have been found in the Internet
from all examined words this is a total of 92 % congruent words found in the Internet
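
The percentages in the report follow directly from the three word counts. A short check in Python (the variable names are mine; the numbers are taken from the report above):

```python
total_words = 736      # number of words in document
examined_words = 98    # words actually checked
congruent_words = 90   # checked words that were found on the internet


def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


examined_share = pct(examined_words, total_words)     # share of the text that was checked
congruent_share = pct(congruent_words, total_words)   # share of the text found online
hit_rate = pct(congruent_words, examined_words)       # share of checked words found online

# 98 is a multiple of the record length 7
phrases_checked = examined_words // 7
```

This gives 13, 12 and 92, matching the report. The 98 examined words would be consistent with 14 phrases of 7 words each, assuming "record length" means the number of words per checked phrase; the report does not say so explicitly. In other words, only about one word in eight was checked, so the 12 % figure is a lower bound on what might be found online, not an estimate of the whole text.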


 
FarkasAndras
Local time: 11:22
English to Hungarian
+ ...
interesting idea May 14, 2008

I just google characteristic bits and go from there. (One job contained a very strange word that is not in any dictionary I have access to. I googled it and there was only one hit: the document that the bits making up my job came from. The site contained the official translation as well.)

Automating this could come in handy.

I just had a look and there are a couple of free programs, like this one: http://www.plagiarism.phys.virginia.edu/

No idea if any of them are any good, never tried any.


The "rough online alignment" bit is unlikely, especially as you need to find a translation as well as your original document.
But I just did a sizable alignment project (73,000 TUs) and can vouch for hunalign. If you have a word-pair dictionary for your language pair and the originals are mostly in sync, it does a remarkable job even without human intervention. Most of the time it correctly detected when one of the texts had 5 paragraphs missing!
With precisely formatted and exactly matching input texts like the Europarl corpus, you just throw the text at it and get a 99.9% correct automatic alignment.
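
To give an idea of how an automatic aligner can cope with missing paragraphs, here is a toy sketch in Python. It is not hunalign and does not use a dictionary: it pairs paragraphs purely by length with dynamic programming, and may skip a paragraph on either side at a fixed cost. The function name, the cost formula and the `skip_penalty` value are assumptions made for illustration.

```python
def align_by_length(source, target, skip_penalty=1.0):
    """Pair paragraphs of two texts by length, allowing skipped paragraphs.

    Moves: pair one source with one target paragraph (cost = relative
    length difference), or skip a paragraph on either side (fixed cost).
    """
    n, m = len(source), len(target)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            moves = []
            if i > 0 and j > 0:
                a, b = len(source[i - 1]), len(target[j - 1])
                pair_cost = abs(a - b) / max(a, b, 1)
                moves.append((cost[i - 1][j - 1] + pair_cost, (i - 1, j - 1)))
            if i > 0:  # source paragraph with no counterpart
                moves.append((cost[i - 1][j] + skip_penalty, (i - 1, j)))
            if j > 0:  # target paragraph with no counterpart
                moves.append((cost[i][j - 1] + skip_penalty, (i, j - 1)))
            if moves:
                cost[i][j], back[i][j] = min(moves)
    # Walk back from the end, keeping only the paired paragraphs.
    pairs = []
    i, j = n, m
    while (i, j) != (0, 0):
        prev_i, prev_j = back[i][j]
        if (prev_i, prev_j) == (i - 1, j - 1):
            pairs.append((source[i - 1], target[j - 1]))
        i, j = prev_i, prev_j
    pairs.reverse()
    return pairs


source = [
    "Short intro.",
    "A much longer paragraph that is missing from the translation entirely.",
    "Bye.",
]
target = ["Kurze Einl.", "Ade."]
pairs = align_by_length(source, target)
```

In this example the long middle paragraph is left unpaired and the two short ones are matched. Real aligners combine length with lexical evidence, which is why a dictionary for the language pair helps so much.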


 

Allesklar  Identity Verified
Australia
Local time: 20:52
Member (2005)
English to German
+ ...
iMacros May 15, 2008

If you are inclined and able, iMacros could be of some help with this.

I haven't looked into it properly yet, but the commercial version also has a scripting interface which could theoretically be used to link a Google search macro to an alignment application.

Don't know whether the potential benefits would justify the trouble, but certainly interesting to play around with.


 

Samuel Murray  Identity Verified
Netherlands
Local time: 11:22
Member (2006)
English to Afrikaans
+ ...
Yes, but you can't use the translation, can you? May 15, 2008

Noe Tessmann wrote:
...and sometimes it would be helpful to know if there are existing translations in your language for example on the EU website before you start working.


What good would it do you if you find an existing translation? You can't use someone else's translation except for terminological purposes. Copyright of existing translations belongs to their respective translators, even if the client owns copyright or has a licence to use the source text.


 

Alex Eames
Local time: 10:22
English to Polish
+ ...
Good point Samuel May 15, 2008

Samuel Murray wrote:

Noe Tessmann wrote:
...and sometimes it would be helpful to know if there are existing translations in your language for example on the EU website before you start working.


What good would it do you if you find an existing translation? You can't use someone else's translation except for terminological purposes. Copyright of existing translations belongs to their respective translators, even if the client owns copyright or has a licence to use the source text.


Good point! If you then use this to plagiarise the text you have found, this software might be used against you, and you end up with egg on your face at best and a breach-of-copyright lawsuit at worst. (Of course, whether or not you get caught depends entirely on the use to which your translation will be put, but the question remains: is it morally acceptable to copy large chunks of text without permission?)

I remember a situation once where we were translating a long document issued by the Polish State Treasury. It was a prospectus showing the tendering/bidding procedures for building power stations. By coincidence, we found out through an associate that they were translating the same document for another client. Happy days. We pooled resources and shared the work. No copying issues in this case, but it was somewhat exceptional.

Alex Eames
http://www.translatortips.com/
helping translators do better business


 
Noe Tessmann  Identity Verified
Local time: 11:22
English to German
+ ...
TOPIC STARTER
I am a plagiarist May 15, 2008

Hello,

Thanks for your input, but I am a plagiarist: I don't reinvent phrases like "turn the screw to the left". I borrow formulations from Wikipedia, even from Windows glossaries. I know you're the good guys; you would never do this.

Actually, I translate a lot of EU-related texts, so I even have to copy entire paragraphs from white or green papers, press releases, communications, quotations and so on. I am soooo evil. Sometimes they're hard to find, and a plagiarism detector would be helpful for me.

Kind regards


Noe

