ProZ.com global directory of translation services
Thread poster: xxxncfialho
How to calculate the error percentage?
xxxncfialho  Identity Verified
Local time: 23:21
German to Portuguese
+ ...
Sep 13, 2010

Hi,

I am wondering how one would calculate the error percentage of a text.
Say you have 10,000 words and there are 25 errors: is that put into direct relation, so that the result would be 0.25% errors in the text?

Thanks for any insight,

Nat


 

Tomás Cano Binder, CT  Identity Verified
Spain
Local time: 00:21
Member (2005)
English to Spanish
+ ...
A wrong approach Sep 13, 2010

To me, you cannot treat all errors the same. A more complex system should be used, in which you give scores to different kinds of errors. Forgetting a comma in the middle of a sentence cannot be the same as omitting a word or losing the meaning.

My customers have different ways of categorising the errors, but all of them assign different levels of severity to each kind of mistake.
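
As a rough illustration of such a weighted scheme, here is a minimal Python sketch. The severity labels, the weights and the function name are invented for the example; they are assumptions, not any client's actual system or a published standard.

```python
# Hypothetical severity weights; every real scheme defines its own.
SEVERITY_WEIGHTS = {"minor": 1, "major": 5, "critical": 10}


def weighted_error_score(errors, word_count, per_words=1000):
    """Return weighted penalty points per `per_words` words.

    `errors` is a list of severity labels, one per marked error.
    """
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    points = sum(SEVERITY_WEIGHTS[severity] for severity in errors)
    return points * per_words / word_count


# 25 errors in 10,000 words: 20 minor, 4 major, 1 critical.
marked = ["minor"] * 20 + ["major"] * 4 + ["critical"] * 1
score = weighted_error_score(marked, 10_000)
print(score)  # 5.0 penalty points per 1000 words
```

With equal weights this collapses to the plain error rate (25 errors in 10,000 words is 2.5 per 1000, or 0.25%); the weights are what make a missing comma count for less than a lost meaning.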


 
xxxncfialho  Identity Verified
Local time: 23:21
German to Portuguese
+ ...
TOPIC STARTER
More info? Sep 13, 2010

Hi Tomás,
Thanks for your fast answer. Can you tell me where to find out more? Is there something like an international table of error weights?
Gracias,
Nat




 

Attila Piróth  Identity Verified
France
Local time: 00:21
Member
English to Hungarian
+ ...
Some standards Sep 13, 2010

Hi Nat,

There is no single solution that can cover all situations. Therefore, different standards have been developed in different fields.

In automotive engineering translations, the standard SAE J2450 is widely used. This is sometimes applied to other technical fields as well.

www.lisa.org (Localisation Industry Standards Association) also developed its own standard for the localization industry.

The highly recognized exam of the American Translators Association uses a fairly detailed list of error categories and error weights. Their Into-English Grading Standards is a long but very instructive read.

In my experience, the requirement of using a detailed pre-established list of error categories and weights often helps to reduce the usual friction between revisers and translators.

Best regards,
Attila


 

Christine Andersen  Identity Verified
Denmark
Local time: 00:21
Member (2003)
Danish to English
+ ...
Context makes a difference too Sep 13, 2010

Thanks for the links, Attila!

I have been thinking I ought to find things like this and read them seriously as this year's Professional Development...

On my own account, I would add that sometimes minor errors can be ignored, while on other occasions the text must be checked and revised until it is really as close to perfect as it can be.

A draft contract or plans for internal use in a company must be clear and accurate, but the odd comma or typo should not raise anyone's blood pressure. It is all going to be revised several times later after all. It is not always worth the time and effort of polishing every last detail.

In other texts accuracy is more critical:

-- Medical records etc. are the obvious case in the work I do. Everyone else will have their own examples.

-- Technical plans of any kind that will be used in engineering, building and manufacturing, where precision is crucial.

Here formatting and the elegance of the style may not always be so important, but the meaning must be precise and unambiguous.

The final contract, or publicity that represents a company's profile and is designed to impress customers or clients, must additionally be well set out and look attractive.

The style and register must suit the purpose, the target group etc. etc.

It is practically impossible to convert all these considerations into a percentage, and a percentage will often not be meaningful either, apart from situations where it is an examination grade or will be used for that kind of purpose.

When I proofread (as it is called) for a client, I deliver a marked text (tracked changes with comments if necessary) and a 'clean' copy, which I consider ready for use.
I raise issues that may need further clarification, like inconsistent spelling of names, or places where the source text is ambiguous and difficult or impossible to translate, but I normally do not classify the errors.

If asked to tick a checklist, I count typos, grammar and errors that do not affect the meaning as 'not serious', and others as serious, but I leave the client or agency to make their own judgements after that.


 

Samuel Murray  Identity Verified
Netherlands
Local time: 00:21
Member (2006)
English to Afrikaans
+ ...
You can't just have one percentage Sep 13, 2010


ncfialho wrote:
I am wondering how one would calculate the error percentage of a text. Say you have 10,000 words and there are 25 errors: is that put into direct relation, so that the result would be 0.25% errors in the text?


I agree with what many other respondents here have said, namely that the ideal error percentage method would include several error categories and error severities.

However, the more complex an error grading system, the more likely it will fail, because proofreaders aren't paid more money for more complex marking systems. Suppose you pay a reviewer 1c per word to review text and simply mark the errors (without classifying them). If you now introduce an error classification system with 5 types of errors and two grades of severity, it will take that reviewer far longer to review the file. Now imagine an error classification system with 10 types of errors, and 5 grades of severity. Sooner or later the reviser will start to take shortcuts, because time is money. Either he will focus on just two or three types of errors, or he will become more lenient with the translation in order to reduce the number of errors he has to mark up.

So, calculating error percentages without having specified categories of errors is dangerous. But if you are confident that your reviser has marked only real errors (objective errors) and has not made any preferential changes, then I think a set of single-category, single-severity percentages could still be useful.

I would be cautious of the calculation (number of errors) / (total number of words). A better calculation may be (number of sentences with errors) / (total number of sentences).

Another telling calculation may be (number of errors) / (total number of words in sentences (or paragraphs) that contain errors) -- this will give you an indication of the density as opposed to the "amount" of errors. Although... you'd have to decide how to interpret such a result, i.e. would you consider a higher or a lower density to be better?
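
To make the three calculations above concrete, here is a small Python sketch. It assumes the reviser's markup is available as one error count per sentence, and it uses a deliberately naive sentence splitter; the function names and the sample text are made up for the illustration.

```python
import re


def split_sentences(text):
    """Naive splitter: breaks after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def error_metrics(sentences, errors_per_sentence):
    """Compute the three ratios from per-sentence error counts."""
    if not sentences or len(sentences) != len(errors_per_sentence):
        raise ValueError("need one error count per sentence")
    word_counts = [len(s.split()) for s in sentences]
    total_errors = sum(errors_per_sentence)
    # Words and sentences belonging to sentences that contain an error.
    flagged_words = sum(
        w for w, n in zip(word_counts, errors_per_sentence) if n > 0
    )
    flagged_sentences = sum(1 for n in errors_per_sentence if n > 0)
    return {
        "errors_per_word": total_errors / sum(word_counts),
        "sentences_with_errors": flagged_sentences / len(sentences),
        "error_density": total_errors / flagged_words if flagged_words else 0.0,
    }


sentences = split_sentences(
    "The cat sat on the mat. It was a sunny day. Nobody came."
)
metrics = error_metrics(sentences, [2, 0, 1])
print(metrics)
```

In this toy sample, 3 errors fall in 13 words overall, 2 of the 3 sentences contain an error, and the density is 3 errors over the 8 words of the affected sentences.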

You can also consider the relative position of the majority of errors. If most errors occur in the last 1/4 of the file, then you can be fairly certain that the translator had run out of time towards the end and had been less careful towards the end of the translation... or perhaps that section was done towards the end of a very long translation session when his concentration began to go.
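
A quick way to check this is to count the errors per quarter of the file. The sketch below assumes each error is recorded by the 0-based index of the word where it occurs; the function name and the sample positions are hypothetical.

```python
def errors_by_quarter(error_positions, total_words):
    """Count errors falling in each quarter of the text."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    counts = [0, 0, 0, 0]
    for pos in error_positions:
        if not 0 <= pos < total_words:
            raise ValueError(f"position {pos} is outside the text")
        # Integer arithmetic maps positions 0..total_words-1 onto quarters 0..3.
        counts[pos * 4 // total_words] += 1
    return counts


positions = [50, 400, 760, 800, 910, 990]
quarters = errors_by_quarter(positions, 1000)
print(quarters)  # [1, 1, 0, 4]
```

Here four of the six errors sit in the last quarter, which is the kind of pattern described above.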

What I would find interesting is a system with three severity grades, namely 1) error that only a language professional or purist member of the public would notice, 2) error that any member of the public would notice and which would give him/her a poor impression about the text, and 3) error that would cause a member of the public to misinterpret the text. Coupled with this would be three error types, namely 1) accidental or incidental error, 2) error that a non-native person might make, and 3) error that can be proven to be an error through reference to an authoritative language guide.
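
Such a three-by-three scheme could be recorded along the following lines. This is only a sketch: the class names, member names and the sample errors are invented here, and nothing like it exists as a standard as far as this thread shows.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    PROFESSIONAL_ONLY = 1   # only a language professional or purist notices
    POOR_IMPRESSION = 2     # any reader notices; gives a poor impression
    MISINTERPRETATION = 3   # would cause a reader to misinterpret the text


class ErrorType(Enum):
    ACCIDENTAL = 1          # accidental or incidental error
    NON_NATIVE = 2          # error a non-native person might make
    PROVABLE = 3            # provable from an authoritative language guide


@dataclass(frozen=True)
class MarkedError:
    severity: Severity
    error_type: ErrorType
    note: str = ""


def tally(errors):
    """Count marked errors per (severity, type) combination."""
    return Counter((e.severity, e.error_type) for e in errors)


marked = [
    MarkedError(Severity.PROFESSIONAL_ONLY, ErrorType.ACCIDENTAL, "double space"),
    MarkedError(Severity.PROFESSIONAL_ONLY, ErrorType.ACCIDENTAL, "stray comma"),
    MarkedError(Severity.MISINTERPRETATION, ErrorType.NON_NATIVE, "false friend"),
]
summary = tally(marked)
print(summary[(Severity.PROFESSIONAL_ONLY, ErrorType.ACCIDENTAL)])  # 2
```

The reviser only picks one of nine cells per error, and the client gets a small table instead of a single percentage.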

I don't think it is relevant for clients to know whether something is a punctuation error, capitalisation error, grammar error, syntax error etc. What use could that information possibly have for a client? The purpose of error classification is to figure out how bad the text is, and why the error is something to be concerned about. In some languages, punctuation and capitalisation errors may be very bad but in other languages they may be of negligible effect.


 

Pablo Bouvier  Identity Verified
Local time: 00:21
German to Spanish
+ ...
How to calculate the error percentage? Sep 13, 2010



As has already been said, you should first categorize the types of errors.
Then each type of error should be weighted.

To give an example: Big Blue did not count a text omission within a fairly large volume of words (6000 words) as a mistake if it did not alter the meaning. That is, the weighting factor was zero, because the omitted text neither added nor subtracted any information from the content.


 

Bryan Crumpler  Identity Verified
United States
Local time: 18:21
Dutch to English
+ ...
From a methodical and legal standpoint... Sep 13, 2010

I do not agree with the above sentiments:

From a mathematical standpoint, the evaluation system cannot be weighted in a way that does not correspond to the value of each term to which the system is being applied. Otherwise it is a biased or discriminatory system that cannot be reliably described in terms of a percentage unless corresponding to a rubric quantifying the weights of each error classification.

If an agency or client pays you based on a price per word, however, then this is the same as them telling you that the value of each word is equivalent, which nullifies the notion of a weighted system having any relevance based on severity of the errors. In such instances, each error must be construed as having the same value if you wish to affix a mathematical percentage to it.

What's being stated above is only indicative of the fact that calculating an error percentage is not a reliable measure for determining the "quality" of a translation.

An error is an error.


 

Pablo Bouvier  Identity Verified
Local time: 00:21
German to Spanish
+ ...
How to calculate the error percentage? Sep 13, 2010



Well, then tell it to Big Blue...

As far as I know, they used a Gaussian curve (and this means percentages) to determine the error frequency as a function of the number of words for each type of error, and then they weighted the errors according to this curve and some common-sense criteria.

As I worked for this Big Blue more than 15 years ago, and they suddenly decided to outsource the translation business for no reason apparent to their translation department employees, I no longer have the description of the procedure.

But it is quite clear to me that in a very large number of words some minor errors may be overlooked without important consequences, while only one or two higher-order errors make it absolutely impossible to approve an entire translation.





[Edited at 2010-09-13 16:16 GMT]


 

Christine Andersen  Identity Verified
Denmark
Local time: 00:21
Member (2003)
Danish to English
+ ...
Yuo amy vrey ewll eb rihgt Sep 13, 2010


Bryan Crumpler wrote:

I do not agree with the above sentiments:

From a mathematical standpoint, the evaluation system cannot be weighted in a way that does not correspond to the value of each term to which the system is being applied. Otherwise it is a biased or discriminatory system that cannot be reliably described in terms of a percentage unless corresponding to a rubric quantifying the weights of each error classification.

If an agency or client pays you based on a price per word, however, then this is the same as them telling you that the value of each word is equivalent, which nullifies the notion of a weighted system having any relevance based on severity of the errors. In such instances, each error must be construed as having the same value if you wish to affix a mathematical percentage to it.

What's being stated above is only indicative of the fact that calculating an error percentage is not a reliable measure for determining the "quality" of a translation.

An error is an error.


Hree si na exmaple fo 101% erros, btu I wold nto coutn ayn fo htem sa srious fi siloated...

Srory, I cuold nto rseist htat


I was once told about a system for evaluating translations for school examination purposes. In those days candidates were not allowed dictionaries or other aids, so they were relying entirely on memory.

A consistent error counted as one error even if it was repeated throughout the text. (E.g. wrong gender in French or other languages that have gender, or spelling mistakes.)
If the error was not consistent, i.e. sometimes correct and sometimes not, then it counted as an error every time it was written incorrectly.

I.e. inconsistency and carelessness are worse than simply believing that probaly is the correct way to spell probably, for instance.
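
For anyone who wants to try this counting rule, here is a minimal Python sketch of how I understand it. It assumes the marker has noted, for each occurrence of an item, whether it was written correctly; the function name and the sample data are made up.

```python
from collections import Counter


def count_exam_errors(occurrences):
    """Apply the 'consistent errors count once' rule.

    `occurrences` is a list of (item, is_correct) pairs, one pair for
    each time the item appears in the candidate's text.
    """
    wrong = Counter()
    right = Counter()
    for item, is_correct in occurrences:
        if is_correct:
            right[item] += 1
        else:
            wrong[item] += 1
    total = 0
    for item, n_wrong in wrong.items():
        if right[item] == 0:
            total += 1        # always wrong: a consistent error, counted once
        else:
            total += n_wrong  # sometimes right: every wrong occurrence counts
    return total


occurrences = (
    [("probably", False)] * 4        # always spelt 'probaly'
    + [("gender of maison", True)] * 3
    + [("gender of maison", False)] * 2
)
errors = count_exam_errors(occurrences)
print(errors)  # 3
```

The four identical misspellings cost one point, while the two careless gender slips cost two.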

In practice, there is a considerable difference between serious erros and teh odd typo.

Commas, as we all know, can be critical.

The percentage of errors is generally used to assess the performance of a particular translator.

Would you let this person work for you again? is often the question behind the exercise, and then the types of errors may be very interesting.

Can they be remedied by using a spell checker, checking grammar rules, just remembering that the verb is irregular, etc.?
Or more sleep and fresh air, or getting a colleague to proofread your work when you are in a hurry?

In other cases an agency can look at the type of error and decide that the person has simply not understood the source, or does not know the specialist terminology in the target language.
Long-term study may be needed before that particular person will - if ever - be good at translating that type of text.

An error is NOT just an error if you want to make any use of the information once you have worked it out.

I once helped a fellow-student at college to write an error-free essay, but he was very disappointed with the grade he was given. This was not a translation, but the level of language and the content were so basic that he still only scraped a pass, and did not get the brilliant mark he hoped for.

You are absolutely right in pointing out that calculating an error percentage is not a reliable measure for determining the "quality" of a translation.


 

Samuel Murray  Identity Verified
Netherlands
Local time: 00:21
Member (2006)
English to Afrikaans
+ ...
Yes, and... Sep 13, 2010


Christine Andersen wrote:
Can they be remedied by using a spell checker, checking grammar rules, just remembering that the verb is irregular, etc.?


Yes, sometimes the presence of a certain type of error can say a lot about the translator -- i.e. whether he is careful or careless in his work.

But one has to take into account the environment in which he works as well. If the translator is forced to use a CAT tool that has no spell-checker in his language, then 3 spelling errors in 1000 words may indicate a careful translator, but if he used a system that did have a spell-checker, then 3 spelling errors in 1000 words may indicate a careless translator.

