Bug in contest code led to "leak" of entries in a handful of pairs
Thread poster: Henry Dotterer

Henry Dotterer
Local time: 05:04
SITE FOUNDER
Mar 26, 2008

Dear members,

I have some unfortunate news to report to you all. We have learned that due to a programming error, the integrity of the current ProZ.com translation contest has been compromised in several of the contest's language pairs. Most pairs were not affected, but a few pairs were.

Lest there be any misunderstanding, what happened this time around is in no way the fault of site members - it is purely an error on the part of site staff and developers, i.e. my team, and for that, I take full responsibility. What I am reporting here does not involve an attempt on anyone's part to cheat.

Also, this issue is particular to the current contest, contest #6. There have been other issues in past contests, but those were in no way related to this one, which would have had no effect on past results.

To explain: Briefly, due to a programming error that took effect March 5, non-members who visited a certain page (the page for submitting entries in certain pairs) were displayed contest entries that had been submitted by others--even while submission was still possible.

As you may know, non-members can not submit entries. Still, the fact that entries were "leaked" while submission was still possible in some pairs calls into question the validity of entries that were subsequently submitted or edited (since there could have been communication of some form between the owner of a non-member profile and the owner of a member profile.)

We have analyzed the situation carefully on a pair by pair basis, and here is what we have found:

* In 26 of the 98 language pairs represented in the contest, everything is ok. The submission phase was not extended and others' entries could not be viewed--so the integrity of the contest was not compromised. Finals round voting will proceed as normal in these pairs. The option to vote will appear shortly.

* In an additional 50 of the 98 language pairs, effects from this bug can be ruled out based on analysis of site logs, either because no one ever viewed others' entries, or because there were no entries submitted or edited after such viewing occurred. In these pairs, too, final round voting can occur.

* In 3 of the 98 pairs (English to Macedonian and Tamil, Italian to Albanian), entries were viewed, but due to an insufficient number of entries final round voting will not occur.

That leaves 19 "affected" pairs in which entries were viewed by at least one non-member and changes or new submissions were made subsequently by at least one member. In these pairs, a decision has to be made whether or not to hold final round voting.

We have reviewed activity on a user by user basis in these pairs, and the situation varies from pair to pair. Specifically:

* In 8 of the 19 "affected" pairs, there is reasonable evidence to suggest that no contestant benefited from access to any other's entry. In these pairs, the page that errantly displayed others' entries was viewed by just one or two people, and site staff members succeeded in contacting all of them over the last few days. Based on an analysis of the viewers' site activity, and confirmed by their replies, it does not appear that the content of the entries viewed was accessible to other contestants. Supporting this conclusion is the fact that in most of these pairs, a low number of entries was edited or submitted after the viewing of entries, and the timing of such edits did not correspond to the timing of the viewing. In short, although unfair advantage can not be ruled out entirely, it does appear unlikely. The pairs are:

English Albanian
English Catalan
English Indonesian
German Romanian
German Spanish
Italian Romanian
Romanian Italian
Russian Spanish

In the above pairs, barring strong objection from members here, the decision would be to hold final round voting as normal.

* In another 5 of the 19 "affected" pairs, the situation is similar to the above (one or two viewers, logs suggest unfair benefit is unlikely). However, in these five pairs, it has not yet been possible to confirm with the viewers that they have not communicated what they saw to anyone. Those pairs are:

English Bulgarian
English Slovak
English Ukrainian
English Vietnamese
Russian German

In these pairs, we will continue to try to reach the few people who viewed entries. Pairs in which it can not be confirmed that the viewers did not communicate with anyone will be moved into the next ("hybrid") category.

* Finally, in the last 6 of the 19 "affected" pairs, we have what could be called a "hybrid" situation. In the case of entries that were submitted before the time that others' entries might first have been viewed (37 of 68 entries), and were not edited thereafter, it can be assumed that there was no unfair advantage. On the other hand, in the case of entries that were either submitted or edited after others' entries might first have been viewed (31 of 68 entries), while unfair advantage is unlikely it can not be ruled out entirely. Those pairs are:

English Arabic
English Greek
English Hungarian
English Japanese
English Polish
English Turkish

In these pairs, the integrity of the contest has at least been put into question by the bug.
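
For readers who want to check the figures, the breakdown reported above can be tallied in a few lines of Python. This is only a sanity check of the numbers stated in this post; the dictionary keys are shorthand labels made up for the sketch, not site terminology.

```python
# Counts as stated in the post; the keys are shorthand labels invented for this sketch.
all_pairs = {
    "no_extension_no_viewing": 26,
    "ruled_out_by_logs": 50,
    "viewed_but_too_few_entries": 3,
    "affected": 19,
}
affected_pairs = {
    "viewers_confirmed": 8,
    "viewers_not_yet_reached": 5,
    "hybrid": 6,
}
hybrid_entries = {
    "before_first_view_and_unedited": 37,
    "submitted_or_edited_after": 31,
}

total_pairs = sum(all_pairs.values())
total_affected = sum(affected_pairs.values())
total_hybrid_entries = sum(hybrid_entries.values())

print(total_pairs, total_affected, total_hybrid_entries)  # prints: 98 19 68
```

The three sub-categories add up to the 19 "affected" pairs, and the four top-level categories add up to all 98 pairs.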

Furthermore, we now know that in the English Turkish and English Polish pairs, there was in fact communication between at least one member and one non-member (in each pair) about entries that had been leaked.

We do not know whether or not such communication occurred at all in the other four pairs.

A decision must now be made as to how to proceed in the six pairs that are in the hybrid situation. The decision as to how to proceed will be made on the basis of members' input received in this discussion thread, and offline as well.

Site staff members have identified the following three possible options for the pairs in the "hybrid" situation:

Option 1: ("Asterisk" approach) Proceed to finals round voting with all entries allowed - but do not consider it an "official" contest. A note would appear in the page, and winning information would not go to profiles.
Option 2: ("Selective" approach) Proceed to final round voting with only those entries that are known with certainty to be unaffected.
Option 3 (added later): ("Modified selective" approach) Ask those who had the benefit of others' entries or discussion about them to voluntarily remove their entries. Proceed to final round voting with all remaining entries, making clear in which cases access to other entries was impossible.
Option 4: ("Cancellation") Cancel the contest in the "hybrid" pairs.

Note that in Japanese and Turkish there are 1 and 2 entries, respectively, that could not have been influenced by the leak. As these numbers are below the minimum number of entries required for final round voting, in these pairs, only options 1 and 3 exist.

Obviously none of these solutions is ideal, and none is fair to members who have participated in this contest. These members -- through no fault of their own -- are now deprived of the opportunity to compete in (and potentially win) a contest known to be fair. Therefore, to compensate those who submitted entries in these 6 pairs, a half year of membership will be granted in the form of an extension to the current membership period.

We recognize that this compensation will be small consolation to those who have been inconvenienced due to an error on the part of site developers and staff members. Once again, on behalf of the site staff, I apologize to all of you for this serious error and promise that we will take the steps necessary to ensure the integrity of the contest in future rounds. Although contests were announced as a "just for fun" activity, we view it as our responsibility to ensure that they are completely fair. We have let you down in this respect, and we regret that.

I await your opinions as to how best to proceed in the 6 "hybrid" pairs. Weight will be given to the views of people who actually participated in those pairs...

Which of these options do you prefer?

If more information is required, we can provide, on a pair by pair basis, information related to the number of people who viewed other entries, the time viewed, number of entries known to be unaffected, etc.

Henry



Murat Uzum  Identity Verified
Local time: 12:04
English to Turkish
Local Staff Mar 27, 2008

Dear Henry,

Thank you for the information regarding this bug. First of all, I'm sorry to hear that Turkish was included in this category. Anyway, I think there are some ways to find the cheaters who benefitted from this bug.

For Turkish, on 5th March there were just 5 entries and, as far as I know, the number was the same on 18th March; I don't know whether new entries were made by the final deadline, 26th March. But let's say there are 10 entries in total. At least we know that every edit and submission attempt is logged on the site, including the change of a comma.

If possible, local staff or local representatives could evaluate these 10 entries. As you mentioned, the entries made up to the first deadline, 5th March, weren't viewed, and the bug occurred after this date. So these 5 entries seem to be clean and not affected by the bug, unless extensive edits were made to them before 26th March.

Maybe the entries submitted after this date, and the edits made to already submitted texts, can be analysed, and the cheaters found on that basis.

I'd like to point out that I just wanted to contribute to the organization; winning or losing wasn't very important to me when I submitted my text. Now I just wish for Turkish to be cleared of any wrongdoers' acts.

Best Regards,
Murat Uzum


Henry D wrote:

Dear members,

I have some unfortunate news to report to you all. We have learned that due to a programming error, the integrity of the current ProZ.com translation contest has been compromised in several of the contest's language pairs. Most pairs were not affected, but a few pairs were...


Henry


[Edited at 2008-03-27 05:39]



Samuel Murray  Identity Verified
Netherlands
Local time: 11:04
Member (2006)
English to Afrikaans
+ ...
Asterisk +1, but... Mar 27, 2008

Henry D wrote:
Option 1: ("Asterisk" approach) Proceed to finals round voting with all entries allowed - but do not consider it an "official" contest. A note would appear in the page, and winning information would not go to profiles.


Any indication about what the wording of the note would be? I can imagine it must be difficult to phrase such a note so that it doesn't appear that the contestant was potentially one of a number of cheaters. I mean, I understood your original post to mean that there was no deliberate dishonesty, yet the first replier to your message interpreted your post to say "there were cheaters".



Henry Dotterer
Local time: 05:04
SITE FOUNDER
TOPIC STARTER
Details of English Turkish Mar 27, 2008

Murat Uzum wrote:
Thank you for the information regarding this bug. First of all, I'm sorry to hear that Turkish was included in this category. Anyway, I think there are some ways to find the cheaters who benefitted from this bug.

It seems to me that I may not have been clear enough. It is not fair to say that anyone "cheated" in relation to this problem. The situation is that if a person has edited his or her entry after the time someone else viewed others' entries, from an outsider's point of view, there is no guaranteeing that that person has written his/her translation without the benefit of some information about the other entries (because there is an outside chance that the contestant has communicated with the person who saw the entries). In other words, the simple timing of an edit would put the entry into question - through no fault of the contestant.

Murat Uzum wrote:
For Turkish, on 5th March there were just 5 entries and, as far as I know, the number was the same on 18th March; I don't know whether new entries were made by the final deadline, 26th March. But let's say there are 10 entries in total. At least we know that every edit and submission attempt is logged on the site, including the change of a comma.

Unfortunately, in this case we do not have a record down to the comma. For contest entries, we know whether or not there was an edit (a "submit"), and we know the time of the edit, but we do not know exactly what changed.

In English Turkish, there were five entries. As you note, all five entries existed before the first non-member visited the page that showed entries (that was at 8:26pm (GMT) on Mar 11, 2008). After this time, three of the English Turkish entries were edited. (This is not unusual; many people have edited their entries in all pairs.) So we have 2 entries that could not have changed as a result of the viewing, and three that could have.
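
The timing rule described here can be sketched in Python. This is a minimal illustration, not the site's actual code: the `Entry` class, the `could_be_affected` function, the entry labels and all of the entry timestamps are hypothetical. Only the first-view time (8:26pm GMT on Mar 11, 2008) and the split of 2 unaffected versus 3 edited entries come from this thread.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# First time a non-member saw the English Turkish entries, as stated in the thread.
FIRST_VIEW = datetime(2008, 3, 11, 20, 26, tzinfo=timezone.utc)


@dataclass
class Entry:
    label: str             # hypothetical identifier, not a real contestant
    last_submit: datetime  # time of the most recent submit/edit (all that is logged)


def could_be_affected(entry: Entry, first_view: datetime = FIRST_VIEW) -> bool:
    """An entry is in question if its last submit came after the first viewing."""
    return entry.last_submit > first_view


def utc(month: int, day: int, hour: int, minute: int) -> datetime:
    """Build a 2008 UTC timestamp for the made-up sample data below."""
    return datetime(2008, month, day, hour, minute, tzinfo=timezone.utc)


# Hypothetical timestamps, chosen only to reproduce the 2 / 3 split reported above.
entries = [
    Entry("A", utc(3, 1, 9, 0)),
    Entry("B", utc(3, 4, 17, 30)),
    Entry("C", utc(3, 12, 8, 15)),
    Entry("D", utc(3, 18, 22, 5)),
    Entry("E", utc(3, 25, 13, 40)),
]

unaffected = [e.label for e in entries if not could_be_affected(e)]
in_question = [e.label for e in entries if could_be_affected(e)]
print(unaffected, in_question)  # prints: ['A', 'B'] ['C', 'D', 'E']
```

Because only the time of each submit is recorded, and not what changed, a rule like this can only flag an entry as possibly affected. It cannot show whether anything in the text actually changed.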

Given this, how do you think we should proceed, Murat?



Henry Dotterer
Local time: 05:04
SITE FOUNDER
TOPIC STARTER
Thoughts on the "asterisk" approach Mar 27, 2008

Samuel Murray wrote:
Henry D wrote:
Option 1: ("Asterisk" approach) Proceed to finals round voting with all entries allowed - but do not consider it an "official" contest. A note would appear in the page, and winning information would not go to profiles.

Any indication about what the wording of the note would be? I can imagine it must be difficult to phrase such a note so that it doesn't appear that the contestant was potentially one of a number of cheaters. I mean, I understood your original post to mean that there was no deliberate dishonesty, yet the first replier to your message interpreted your post to say "there were cheaters".

Right, this is clearly a drawback to this approach. If a sufficiently neutral note can not be found ("Unofficial contest"?), perhaps we could go with no note, the "asterisk" element being only that the results do not appear in profiles or in the overview page for the contest.

Just thinking out loud. What do you all think?



Dagmara Kuliś  Identity Verified
Belgium
Local time: 10:04
English to Polish
+ ...
Selection Mar 27, 2008

First of all, thanks for such a detailed report on the bug. It is indeed quite annoying but that's life in our worlds - such things happen.
Second, as I have submitted my entry for English-Polish, I'm a bit concerned about what will be decided next. As far as I remember, the number of entries was steady for some time, but I could be wrong about that. Anyway, I would suggest the "selective" option: making sure which entries have not been altered and allowing those to participate.
If not, then I would opt for the "asterisk" option. And well, now all we can do is wait for a decision regarding this contest and then wait for a new contest - hopefully without any bugs.



Roman Bulkiewicz  Identity Verified
Ukraine
Local time: 12:04
Member (2004)
English to Ukrainian
+ ...
minimize the injustice... Mar 27, 2008

...if you cannot avoid it.
I understand this was the staff's approach as you tried to sort out the "affected" language pairs and allow as many as possible of them to proceed to the voting stage.

From this perspective, the "selective" option looks the best.

Henry D wrote:
Option 2: ("Selective" approach) Proceed to final round voting with only those entries that are known with certainty to be unaffected.


This, essentially, amounts to "rolling back" to the original submission deadline, more or less.
This way, at least, those contestants who submitted their entries within the original timeframe will not be offended. After all, the "late submitters", leak or no leak, did have an advantage - one of the extra time. (Perhaps this practice of extending the submission phase should be reconsidered in the future contests, by the way.)

Now, to extend this principle of "minimum injustice", maybe the "selective" approach should be modified to allow the "affected" submitters to choose between two options:

1) withdraw their entry from voting (with or without any "compensation");
2) let their entry proceed to the voting stage but have it marked as "submitted after the possible leak" (or something like that, phrased as objectively as possible), PLUS allow them to hide their identity in case they don't win (despite their previously made choice).

This way, the peers who consider voting for an "affected" entry will be given an opportunity to double-check, by comparing it with all the other entries, whether or not its superiority may have resulted from the possible "leak". Thus, the injustice will not be eliminated, but will be decreased: the chances of a good-faith late submitter to win may be lower with the additional scrutiny but will not be brought down to zero as with the original "selective" approach. And, if the "labeled" entry does win, it will be a victory beyond doubt. On the other hand, if it doesn't win, the contestant will remain anonymous and thus not affected by (unjust) suspicions.


Minoru Kuwahara
Japan
Local time: 18:04
English to Japanese
+ ...
English to Japanese, mine was the first entry Mar 27, 2008

*** English to Japanese ***

Hello Henry and all site staff in charge of "ProZ.com Translation Contests",

Thank you for clarifying the situation. Regarding "English to Japanese", which unfortunately seems to be among the "hybrid" pairs: I was the first to post in this language pair, well before all the other posters. I truly did not notice this leak at any point during the open stage, while I was constantly accessing the Contests page to read and revise my initial entry a number of times over the past weeks. As indicated on the page, I remember that the second poster submitted his/her entry some time after mine, and the other 3 all followed close to the closing period, in the last 1 or 2 weeks before the initially set deadline. From this, I have to assume that my entry, being the first, was probably the most exposed to the leak in this pair.

>>>>>If more information is required, we can provide, on a pair by pair basis, information related to the number of people who viewed other entries, the time viewed, number of entries known to be unaffected, etc.

I would definitely like to know this log information if available. Also, I'm naturally curious who was actually able to look at others' entries and, though I don't like to doubt anyone, who might even have submitted (or later edited?) their entries while viewing other posters' entries, if anyone did. This information could be forwarded to us individually, in any form, offline.

And if it's not too much trouble for the moderators, I expect all the posters in this language pair will be contacted too, to verify, first, whether they fully knew the rules for entry submission and whether they really did not "see" others' entries before considering their own. In this respect, we may ask them for fair and honest replies. This means a "selective" option might be able to work.

With no other suggested Options than 1 and 3, I would vote for Option 1, but like others here say, there could be an officially selective voting stage. All posters for this pair should be questioned if they complied with the Contests rules (never looking at other posters' submission at any time), and for any "honest" answerers who admitted viewing leaked entries, theirs will be out of the pool for voting. Sorry about those posters, but I think this is just a fair treatment. On the other hand, some kind of note to indicate "leakage occurred" may well be put along with all entries since even honest answerers could not be in fact authenticated. I think this could be a better option rather than canceling out the "contesting" aspect.

By the way, I guess selecting only one winner per language pair may not be necessarily best as it is. For example, there could be a ranking system of any sort to appreciate each poster's translation on different levels or categories (styles, vocabulary, phrasing, clarity, reader-friendliness and so on) so that multiple posters would win on some while losing on others, etc. Actually I have some opinions about the Contests, but I'm not going to say all of it here as it should be a different issue.

[Edited at 2008-03-27 10:07]



Murat Uzum  Identity Verified
Local time: 12:04
English to Turkish
Cheating-Copying Mar 27, 2008

Henry D wrote:

Murat Uzum wrote:
Thank you for the information regarding this bug. First of all, I'm sorry to hear that Turkish was included in this category. Anyway, I think there are some ways to find the cheaters who benefitted from this bug.

It seems to me that I may not have been clear enough. It is not fair to say that anyone "cheated" in relation to this problem. The situation is that if a person has edited his or her entry after the time someone else viewed others' entries, from an outsider's point of view, there is no guaranteeing that that person has written his/her translation without the benefit of some information about the other entries (because there is an outside chance that the contestant has communicated with the person who saw the entries). In other words, the simple timing of an edit would put the entry into question - through no fault of the contestant.

Murat Uzum wrote:
For Turkish, on 5th March there were just 5 entries and, as far as I know, the number was the same on 18th March; I don't know whether new entries were made by the final deadline, 26th March. But let's say there are 10 entries in total. At least we know that every edit and submission attempt is logged on the site, including the change of a comma.

Unfortunately, in this case we do not have a record down to the comma. For contest entries, we know whether or not there was an edit (a "submit"), and we know the time of the edit, but we do not know exactly what changed.

In English Turkish, there were five entries. As you note, all five entries existed before the first non-member visited the page that showed entries (that was at 8:26pm (GMT) on Mar 11, 2008). After this time, three of the English Turkish entries were edited. (This is not unusual; many people have edited their entries in all pairs.) So we have 2 entries that could not have changed as a result of the viewing, and three that could have.

Given this, how do you think we should proceed, Murat?


Dear Henry,

By "cheating" I meant copying from another person, which gives the copier an advantage and is a matter of ethics. I wouldn't want it to cause any misunderstanding, though.

I had to submit my entry two or three times, as I remember, since I thought the system hadn't received it. But I'm sure I didn't change a single comma of it, so it would have been better if all the entries had been logged in the system together with their edits, but it seems the system doesn't have them at the moment. Otherwise, the original texts that were sent up to March 11 could be included in the contest, as Roman mentioned. Anyway, I will respect any decision that comes out as a result.

For future contests, I'd suggest that texts be sent via email so that there won't be any bad consequences like this again.

Best Regards
Murat Uzum



Roland Nienerza  Identity Verified

Local time: 11:04
English to German
+ ...
Combined approach - Mar 27, 2008

Dear Henry, -

it does you great honour to have taken this amount of trouble to check out the implications of a small unguarded breach of confidentiality for entries already submitted before the input phase was over.

As you have narrowed down the possibility of some very hypothetical abuse to just 6 pairs - of which only two had a somewhat higher number of participants - I would propose a combination of option 1) and option 2) - meaning to have all entries in voting, and to apply the asterisk only if a winner comes out of the "inner circle" of possible beneficiaries of the uncertainty.

On the other hand - as has certainly been observed by other contributors to the contest, and definitely by me - the contest is not, and does not have, a mathematical procedure to establish the best translator. I have seen, in my own and in some other pairs, rather significant flaws in winning pieces. So much so that, in a certain way, and without much irony or acrimony, one could also call the contest the "ProZ.com Translation Contest Lottery" - or "ProZ.com Contest about Turning a Translation into a Fairy Tale" etc.

Some do their translation completely on their own. Some do them in a team, and mention one or more contributors in it. - And some, if not many, do them in a team without mentioning any other contributor. - I have seen an email of a winner in a team, who had put under the signature of his mail, as promotion tag, "Winner of ProZ.com Contest". Forgetting to put the prefix Co- before Winner. - Ach ssso.

This being said, I repeat that I recognize the effort of the organizers of the contest to offer equal chances to everyone.

And therefore I have to draw your attention to another "potential bug" in contest management, that I only noticed this time, but I think had existed before.

When the qualifying vote starts, there is in several pairs the situation that the quorum is not yet reached - and submission as well as **editing** is still possible, while in other pairs, based on that same source text, voting is already on. That means, e.g., that while Arabic Urdu and Arabic Pashtu might already be in the voting, and accessible at least to members, even if they are not allowed to vote, submission and editing are still open for Arabic Vietnamese and Arabic Swahili. -

Now, if a contributor or would-be contributor to the last two pairs, or any other pair in this situation, has a good or even just some knowledge of Urdu or Pashtu, he could well glean some inspiration from the Arabic Urdu, Arabic Pashtu or other pairs already visible. - This is a somewhat less direct way to profit from others than with non-members "possibly" conveying information to members, straight away in the same language pair. It may be considered as something like Google or other research. - But it could also be seen as an "eavesdropping" possibility that might have to be addressed in the future.



Courtney Sliwinski  Identity Verified
Local time: 11:04
German to English
+ ...
Selective Approach Mar 27, 2008

Dear Henry,
I think your suggestion of using a selective approach is the most feasible. It minimises the amount of unfairness, but eliminates those entries that were questionable. Roland makes a good suggestion, but then if one of the questionable entries were to win, then everyone would probably suspect them of cheating. This could be damaging to their reputation. I think the selective approach is best. It would avoid this type of embarrassing situation, and still allow those who truly put forth a great deal of effort to participate in the contest. As you said, we aren't specifically talking about cheating, so it would not be fair if these participants were labelled as such.



Pavel Tsvetkov  Identity Verified
Bulgaria
Local time: 12:04
Member (2008)
English to Bulgarian
+ ...

MODERATOR
The organization of the current contest was not too good Mar 27, 2008

I am sorry to say that - but the organization of the current contest (and this is my first participation in proz.com contests) leaves much to be desired.

First there was a deadline which was later changed. This leaves a bad aftertaste. By definition a deadline is a final line, one that cannot be crossed. Observing this deadline affects one's work, the quality of his/her translation, etc. Those who had not submitted an entry when they should have were given a second chance - and that sounds like a good and noble thing to do; however, it was not fair to those who had observed the rules of the contest in the first place. Yes, I was given the opportunity to go back and review/change my work, but psychologically I was put in a position of disadvantage, as it felt strange to alter what I had felt was fit to win just a day or two earlier. Now, if I had initially known that the deadline would be different, I might have reconsidered my approach and submitted a different translation altogether.

A second deadline was then issued for new submissions and voting in some of the pairs. The final round of voting was supposed to start yesterday. It did not.

Now we learn that the whole process of additional submission and editing could have been compromised - theoretically - but no one really knows (and most probably they never will). No new deadline has been issued as the organizers are obviously not sure at this point how to resolve the situation.

Quite frankly, I do not feel very concerned about the theoretical possibility of someone seeing something, calling somebody else – and this latter person changing something as a result. It is not impossible, but it is difficult and not likely. And if someone has taken advantage of the situation, well, it is a question of moral values and self respect.

However the issuing of new deadlines - time and again - is what I find unacceptable.

I like www.proz.com a lot, but in this specific case they could have done better.

[Edited at 2008-03-27 12:38]



Samuel Murray  Identity Verified
Netherlands
Local time: 11:04
Member (2006)
English to Afrikaans
+ ...
Irrelevant issues Mar 27, 2008

Pavel Tsvetkov wrote:
I am sorry to say that - but the organization of the current contest (and this is my first participation in proz.com contests) leaves much to be desired.

First there was...

Now we learn that...

However the issuing of new deadlines - time and again - is what I find unacceptable.


Unless I'm mistaken, this thread is about the bug only, and not about other gripes we may have with how the contest is/was being run. The bug was not caused by the shifting deadlines nor were the deadlines shifted in response to the bug. The deadlines issue is irrelevant here.

Besides, the organisational staff can hardly be blamed for a programming bug, and the programmer can't be held responsible for the organisational staff's decisions about deadlines. The two issues are not related.

Post a new thread if you feel strongly about it (that's my opinion); leave this thread to the bug issue.

[Edited at 2008-03-27 12:43]



Larissa Boutrimova  Identity Verified
Canada
Local time: 05:04
Member (2006)
English to Russian
+ ...
A quick poll Mar 27, 2008

How about conducting a quick poll and going by the majority?
