world wide lexicon - seeking volunteer translators?
Thread poster: Jane Lamb-Ruiz
I suggest everybody take a look at this site. http://picto.weblogger.com/
To build a worldwide lexical database is a good idea, but to get "volunteers" around the world to translate "texts for free" is ridiculous.
"To participate, volunteers will download a presence awareness program. When the user is online, but apparently not busy, he or she will occasionally be prompted to provide definitions or !!!!translations!!!! to participating WWL servers. Even if only a small percentage of Internet users participate in this project, the system will be able to capture millions of new definitions per month."
And here's a PC World article that further elucidates this initiative:
I am interested in people's reactions to this.
| One big question - volunteers || Jun 22, 2002 |
The whole basis for this project is volunteers. First, they don't mention how they intend to recruit volunteers. Are they going to magically appear out of the woodwork? Second, even if they get volunteers, they don't say how they intend to verify the correctness of the translations.
If we look at Kudoz, the system works well because there is a "reward" for helping others. There is also a mechanism for verifying correctness (which is also rewarded, with "Browniz"). Trying to build a translation system entirely on the goodwill of those who make this their profession will ultimately fail.
I'm all in favour of having a worldwide database of terms (something like Eurodicautom). However, trying to take it to the next level (a translation service) is not feasible.
| Voluntary translation work || Jun 22, 2002 |
Sounds like a good project for people getting started in translation, for one thing. Voluntary work is a good way to establish a work record. The PC World article clarifies that there will be a built-in quality control process.
To access Jane\'s link to the PC World article, cut and paste the whole line.
"The dictionary servers collect these translations, compare them with other responses, and determine whether to add the term to their databases. Over time, the translator volunteers would be assigned a kind of reliability rating that would give their responses more or less weight based on past accuracy."
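The reliability rating the article describes could work roughly as in the sketch below: each volunteer's past accuracy weights how much their submission counts toward acceptance. Everything here (class names, the smoothing formula, the voting rule) is an illustrative assumption, not the WWL implementation.

```python
# Hypothetical sketch of a reliability rating: volunteers with a better
# track record get more weight when competing translations disagree.
from collections import defaultdict

class ReliabilityTracker:
    def __init__(self):
        self.correct = defaultdict(int)
        self.total = defaultdict(int)

    def rating(self, volunteer):
        # Laplace-smoothed accuracy: new volunteers start at a neutral 0.5
        # instead of dividing zero by zero.
        return (self.correct[volunteer] + 1) / (self.total[volunteer] + 2)

    def record(self, volunteer, was_accepted):
        self.total[volunteer] += 1
        if was_accepted:
            self.correct[volunteer] += 1

def weighted_vote(submissions, tracker):
    """Pick the candidate translation with the highest summed rating."""
    scores = defaultdict(float)
    for volunteer, text in submissions:
        scores[text] += tracker.rating(volunteer)
    return max(scores, key=scores.get)

tracker = ReliabilityTracker()
tracker.record("ana", True)   # ana has two accepted submissions
tracker.record("ana", True)
tracker.record("bob", False)  # bob has one rejected submission
subs = [("ana", "world"), ("bob", "globe")]
print(weighted_vote(subs, tracker))  # "world": ana's 0.75 beats bob's 0.33
```

A real system would also decay old scores and guard against volunteers gaming their own ratings, but the weighting principle is the same.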
Lise Smidth (German to Danish)
| translating the internet through kudoz??? || Jun 22, 2002 |
quote from the site:
One of the projects we are working on is a document translation service that relies heavily on distributed human computing. This system works by grabbing documents on the internet, slicing them into many small texts, and then assigning each small block of text to one or more human translators. Other translators are asked to score recent submissions from their peers. The translated texts are then stitched together to create a completed text.
This service will not replace traditional document translation services, but rather will provide a more cost effective way to process documents that require rapid processing and good translation. The quality of the translations will be better than machine translation, though not perfect.
wow - i can just imagine how much better than machine translation that would be
btw not sure if translators are to be paid for this service or if it's part of the voluntary work...
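For what it's worth, the "slice, assign, stitch" workflow quoted above amounts to something like the following sketch. The paragraph-based splitting and round-robin assignment are my assumptions; the site does not publish its actual algorithm.

```python
# Hypothetical sketch of the segmentation pipeline described above:
# slice a document into blocks, assign each block to a translator,
# then stitch the translated blocks back together.

def segment(document):
    """Slice a document into small blocks (here: paragraphs)."""
    return [block.strip() for block in document.split("\n\n") if block.strip()]

def assign(blocks, translators):
    """Assign each block to a translator, round-robin."""
    return [(blocks[i], translators[i % len(translators)])
            for i in range(len(blocks))]

def stitch(translated_blocks):
    """Reassemble the translated blocks into one completed text."""
    return "\n\n".join(translated_blocks)

doc = "First paragraph.\n\nSecond paragraph."
assignments = assign(segment(doc), ["ana", "bob"])
# A real system would send each block to its translator and collect
# peer-scored results; uppercasing stands in for the translation step.
translated = [block.upper() for block, _ in assignments]
print(stitch(translated))  # FIRST PARAGRAPH.\n\nSECOND PARAGRAPH.
```

The posters' quality concern lives entirely inside the fake translation step here, which is exactly the point of their skepticism.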
| Response to my email from the site creator || Jun 22, 2002 |
Please note the payment scheme is lightly "embedded" in the answer.
Thank you for your comments.
The document segmentation system is NOT intended to replace existing methods, but rather to provide a lower-cost alternative for documents that must be processed rapidly. Obviously, a text that requires serious study (such as a short story) is not suitable for this type of translation. This is primarily intended to be used for texts that have a short useful life span (news stories, pages on websites, etc). I am approaching this as an economic (resource allocation) problem. There is far more information being published on the net than there are people to translate it. So I am interested in finding ways to process as much relevant information as possible while providing good, though not perfect, quality. The idea is to provide something that is better than machine translation, but less expensive (and faster) than conventional professional translation.
One of the main benefits of the segmentation approach is that it allows for much faster processing because a relatively large document can be divided up, with each chunk being processed in parallel. So instead of waiting for one person to comb through the entire text in serial fashion, you have many people working on different parts of the text concurrently.
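The parallelism argument is easy to illustrate: if each chunk goes to a different translator, the chunks can be processed concurrently rather than serially. A minimal sketch, where the translate stub is a placeholder rather than a real service call:

```python
# Hypothetical sketch: process document chunks concurrently instead of
# having one translator work through the whole text in serial order.
from concurrent.futures import ThreadPoolExecutor

def translate_chunk(chunk):
    # Placeholder for sending a chunk to a human translator and waiting
    # for the result; reversing the string stands in for translation.
    return chunk[::-1]

chunks = ["one", "two", "three", "four"]
with ThreadPoolExecutor(max_workers=4) as pool:
    # map() preserves input order, so the document stitches back together
    # correctly even though chunks may finish out of order.
    results = list(pool.map(translate_chunk, chunks))
print(results)  # ['eno', 'owt', 'eerht', 'ruof']
```

With human translators the "workers" are people rather than threads, but the ordering guarantee is the same requirement: results must reassemble in document order regardless of completion order.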
I understand your point, but I am also skeptical of claims that "only professionals can do this". I used to work in academia, and one of the reasons I left was because perfectly competent people were not able to find work because they did not have sufficient credentials. For example, I have many friends who are bilingual or multilingual. While they are not proficient enough to translate difficult texts, they are certainly good enough to translate other types of texts, and if they participated in something like this on a regular basis, they would also improve with time and practice. It's also worth noting that there is a large population of people in developing countries who will be interested in this as a source of work.
If you'll take some time to look through the specifications, you'll see that I have given a lot of thought to quality control issues, and how to use techniques such as reputation scoring and randomized peer review to catch and correct errors and false input. One of the interesting things about computers is that they can take a large number of factors into account when making decisions. In the context of segmented document translation, the server will use factors such as the following in making its decisions:
* is the contributor a trusted user (e.g. site manager vouches for reliability of translator, subjective score)
* what is the contributor's aggregate peer review score from other users
* what is the randomized peer review score for a specific block of translated text
* how urgent is the document translation job (e.g. more urgent = less stringent peer review)
* how much is the client willing to pay per kilobyte (e.g. high price = more stringent peer review per text)
* how much is a given translator demanding to be paid per workunit
* how difficult is the text being translated (subjective score, difficult text = more peer review)
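Taken together, the factors above amount to a scoring rule that decides how much peer review a given block needs. One plausible way to combine them is sketched below; every weight and threshold here is invented for illustration and is not taken from the GNUtrans specification.

```python
# Hypothetical combination of the listed factors into a single decision:
# how many randomized peer reviews should this block of text receive?

def reviews_needed(trusted, peer_score, urgency, price_per_kb, difficulty):
    """Return the number of peer reviews to request for a block.

    trusted      -- site manager vouches for the translator (bool)
    peer_score   -- aggregate peer review score, 0.0..1.0
    urgency      -- 0.0 (relaxed) .. 1.0 (very urgent)
    price_per_kb -- what the client pays per kilobyte
    difficulty   -- subjective text difficulty, 0.0..1.0
    """
    reviews = 3.0                    # baseline level of review
    if trusted:
        reviews -= 1.0               # vouched-for translators need less
    reviews -= 2.0 * peer_score      # a good track record needs less
    reviews -= 2.0 * urgency        # urgent jobs get looser review
    reviews += price_per_kb / 10.0   # a higher price buys stricter review
    reviews += 2.0 * difficulty      # hard texts get more review
    return max(0, round(reviews))

# An untrusted newcomer on a difficult, well-paid, non-urgent job:
print(reviews_needed(False, 0.0, 0.0, 20.0, 1.0))  # 7
# A trusted veteran on an urgent, cheap, easy job:
print(reviews_needed(True, 1.0, 1.0, 0.0, 0.0))    # 0
```

The interesting design point is the trade-off the author names explicitly: urgency and price pull in opposite directions on review stringency, so the system is pricing quality rather than guaranteeing it.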
One other key point is that the GNUtrans system will not hide the rest of the document from the translator. The translator will see the surrounding areas of the document (the text he/she has been assigned will be highlighted), and will also be able to click through to the source URL for the complete text. So, the translator will be able to see the context in which the sentence or paragraph exists within the larger document.
So, there is a lot more to the system than simply slicing a document into blocks and blindly sending it off to unknown translators. The first version of the system will be fairly simple, but with time, I expect the programs will become more and more sophisticated in the way they behave.
Thanks for your comment. I hope this answers your question/critique. Please email me anytime, as I am interested in critiques from the professional translation community.