Language length in percentage
Thread poster: xxxrma
xxxrma
Jun 14, 2010

Hi everyone,
I would like to know how long languages are - in percentage - compared to English. Does anyone know how to find that out? Does anyone have a list?
**************************************************************
Example:
English is the reference language.
How much longer is Russian than English? 20%? 40%?
How much longer is German than English? 10%? 30%?
How much longer is Arabic than English? 30%? 50%?

!!THE NUMBERS I WROTE AFTER EVERY LANGUAGE IN MY EXAMPLE ARE NOT CORRECT!!
!!I MADE THEM UP TO SHOW YOU WHAT I MEAN!!
**************************************************************

Does anyone have any idea how to calculate this, or where to find these numbers? I have been searching for weeks now and have found almost nothing. Any suggestion is welcome (books, URLs, articles).

Thanks.

All the best.



Ioanna Orfanoudaki  Identity Verified
Belgium
Local time: 23:02
French to Greek
+ ...
Your own statistics Jun 14, 2010

Dear colleague,

You can make your own statistics, simply by comparing the word count of the source and the target document from previous translations you have handled. The more translations you use, the more accurate your statistics will be.
I do not translate in the language combinations you mention, but I can tell you from experience that, for the same language combination, the result may differ depending on whether you count words or characters. For example, you may find that language Y has 10% more words than language X but 5% fewer characters, because words in language Y are generally shorter. It is up to you to decide how to invoice based on such data, i.e. per word or per 55- or 60-character line.
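
A minimal Python sketch of this kind of home-made statistic, for anyone who wants to automate it. The function names are hypothetical, words are simply split on whitespace (so this will not work for languages written without spaces, such as Chinese), and the sample pair is made up for illustration rather than taken from a real job:

```python
def pct_change(source_count: int, target_count: int) -> float:
    """Percentage by which the target is longer (+) or shorter (-) than the source."""
    return round((target_count / source_count - 1) * 100, 1)


def expansion(source: str, target: str) -> dict:
    """Compare one source/target pair by whitespace-separated words and by characters."""
    return {
        "words": pct_change(len(source.split()), len(target.split())),
        "chars": pct_change(len(source), len(target)),
    }


def pooled_expansion(pairs: list) -> dict:
    """Pool the counts over a list of (source, target) pairs, so longer jobs weigh more."""
    src_words = sum(len(s.split()) for s, _ in pairs)
    tgt_words = sum(len(t.split()) for _, t in pairs)
    src_chars = sum(len(s) for s, _ in pairs)
    tgt_chars = sum(len(t) for _, t in pairs)
    return {
        "words": pct_change(src_words, tgt_words),
        "chars": pct_change(src_chars, tgt_chars),
    }


# Illustrative pair only (English -> German), not from a real translation job.
print(expansion("the speed limit", "die Geschwindigkeitsbegrenzung"))
```

In this made-up pair the word count goes down while the character count goes up, which is the kind of divergence between the two measures described above. With real data you would pass all your past source/target pairs to `pooled_expansion`.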

HTH


Marek Daroszewski (MrMarDar)  Identity Verified
Local time: 23:02
English to Polish
+ ...
5 minute search Jun 14, 2010

Try searching Proz.com

http://www.proz.com/forum/linguistics/99068-statistics_about_language_length.html

HTH

Marek


xxxrma
TOPIC STARTER
thanks for your replies Jun 14, 2010

@Marek: this article is the only hint I found about this matter. I need some literature or concrete statistics on this subject. I want to use this information in order to create a QA testing environment.

@Ioanna: as I said, I need some general statistics so that I can create a QA test.

thanks. all the best.



Samuel Murray  Identity Verified
Netherlands
Local time: 23:02
Member (2006)
English to Afrikaans
+ ...
Use a universal piece of text Jun 14, 2010

rma wrote:
Does anyone have any idea how to calculate this?


Take a piece of text that is universally translated, such as the Universal Declaration of Human Rights, or Luke 1 (the longest chapter in the New Testament of the Christian Bible), or suchlike.

==

I've taken Luke 1 from http://www.biblegateway.com/ and created the following table for you:

| Language | Characters without spaces | Characters with spaces | Words |
| Albanian | 9825 | 11711 | 1977 |
| Amuzgo | 13959 | 15853 | 1978 |
| Arabic | 9782 | 11107 | 1337 |
| Bulgarian | 8096 | 9629 | 1627 |
| Bulgarian2 | 8141 | 9836 | 1708 |
| Cakchiquel | 15121 | 18104 | 3003 |
| Chinanteco | 13344 | 15687 | 4194 |
| Chinese-simp | 8762 | 11230 | 2547 |
| Chinese-trad | 8761 | 11232 | 2541 |
| Croatian | 8184 | 9687 | 1517 |
| Czech | 7753 | 9163 | 1423 |
| Danish | 8627 | 10433 | 1821 |
| Dutch | 9255 | 11104 | 1862 |
| English | 9725 | 11690 | 2015 |
| English2 | 9502 | 11352 | 1896 |
| French | 9202 | 10917 | 1809 |
| French2 | 10379 | 12350 | 2105 |
| German | 10680 | 12708 | 2041 |
| German2 | 9168 | 10843 | 1769 |
| Haitian Creole | 8873 | 10878 | 2099 |
| Hiligaynon | 10823 | 12967 | 2158 |
| Hungarian | 8998 | 10551 | 1647 |
| Icelandic | 8398 | 9951 | 1647 |
| Italian | 9250 | 10993 | 1756 |
| Italian2 | 9109 | 10866 | 1770 |
| Jacalteco | 12698 | 14863 | 2185 |
| Kekchi | 12999 | 15219 | 2241 |
| Korean | 9889 | 11211 | 1458 |
| Macedonian | 17170 | 20580 | 3425 |
| Mam | 9934 | 11865 | 2029 |
| Mam2 | 12417 | 14675 | 2277 |
| Maori | 9313 | 11529 | 2310 |
| Norwegian | 8418 | 10185 | 1782 |
| Norwegian2 | 9212 | 11093 | 1975 |
| NT Greek | 8473 | 9936 | 1557 |
| NT Greek2 | 8525 | 10006 | 1575 |
| Náhuatl | 12730 | 14654 | 1944 |
| Platdeutsch | 9085 | 10979 | 1906 |
| Portuguese | 9195 | 10920 | 1774 |
| Portuguese2 | 8576 | 10254 | 1692 |
| Quiché | 13131 | 15828 | 2784 |
| Romanian | 9125 | 10814 | 1783 |
| Russian | 7894 | 9236 | 1436 |
| Russian2 | 8503 | 10010 | 1600 |
| Slovenian | 8181 | 9647 | 1559 |
| Spanish | 10153 | 12223 | 2112 |
| Spanish2 | 10517 | 12562 | 2094 |
| Swahili | 9219 | 10734 | 1524 |
| Swedish | 8986 | 10755 | 1782 |
| Swedish2 | 8968 | 10653 | 1779 |
| Tagalog | 11015 | 13063 | 2058 |
| Ukrainian | 8029 | 9442 | 1507 |
| Uspanteco | 12290 | 14496 | 2226 |
| Vietnamese | 8169 | 10081 | 2006 |

...now you just have to put that into Excel and do some tricks with it.
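
If you would rather not use Excel, the same "trick" can be sketched in a few lines of Python. This is only an illustration: the dictionary and function names are my own, it copies just five rows from the table above, and it uses the middle numeric column (characters with spaces). Keep in mind that these are figures for one chapter in one Bible version per language, and the table shows that two versions of the same language (e.g. German and German2) can differ quite a lot.

```python
# Characters with spaces for Luke 1, copied from five rows of the table above.
CHARS_WITH_SPACES = {
    "English": 11690,
    "German": 12708,
    "Russian": 9236,
    "Arabic": 11107,
    "Spanish": 12223,
}


def relative_to(reference: str, counts: dict) -> dict:
    """Length of each language as a percentage above (+) or below (-) the reference."""
    ref = counts[reference]
    return {lang: round((n / ref - 1) * 100, 1) for lang, n in counts.items()}


# Print the languages from shortest to longest relative to English.
result = relative_to("English", CHARS_WITH_SPACES)
for lang, pct in sorted(result.items(), key=lambda item: item[1]):
    print(f"{lang:<10}{pct:+.1f}%")
```

Swapping in the other columns, or the full table, only means changing the dictionary.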




[Edited at 2010-06-14 14:17 GMT]



David Wright  Identity Verified
Austria
Local time: 23:02
German to English
+ ...
Interesting statistic Jun 14, 2010

But does anyone know why the Macedonian figures are so much higher than most of the others?


Jack Doughty  Identity Verified
United Kingdom
Local time: 22:02
Member (2000)
Russian to English
+ ...
Russian and English Jun 14, 2010

My own experience, from comparison of my own word counts, is that the Russian word count is only about 70% of the English, though this may vary according to what type of text it is. I put this down to the fact that Russian has no definite or indefinite articles, uses word endings in some cases where English uses prepositions, and runs words together ("portmanteau" words) as German does, though to a lesser extent. I charge different rates for target text and source text word counts on this basis.
On the other hand, I remember a native Russian speaker saying in one of the forums several years ago that his Russian word counts were higher than his English.



Henry Hinds  Identity Verified
United States
Local time: 15:02
English to Spanish
+ ...
Varies also by subject Jun 14, 2010

Working between English and Spanish, as a general rule I can say that Spanish is longer than English, BUT this can vary quite a bit by subject. I have found many texts, especially legal ones, where the word count comes out virtually the same.


Heinrich Pesch  Identity Verified
Finland
Local time: 00:02
Member (2003)
Finnish to German
+ ...
Depends how you count Jun 14, 2010

Do you mean word count? Then English is rather average: some languages use more words, others (much) fewer.
If you count characters, then English is rather short. Most Western languages use more characters for a given subject.

But I cannot understand how you would use this kind of statistic for QA. The output depends on the style of the translator. It has nothing to do with quality.

Regards
Heinrich


xxxrma
TOPIC STARTER
great table, good idea, but no linguistic research on this matter??? Jun 15, 2010

The testing is for analyzing the language before the actual translation: I want to know roughly how long the translation into a given language will be, so that I have an idea of how much space it will need. I'm not trying to test the translations themselves; that is impossible to automate.
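
To make clearer what I mean, here is a rough Python sketch of the kind of check I have in mind. Everything in it is an assumption on my side: the function names are hypothetical, and the expansion percentage is an input that would have to come from real measurements for the language pair, not a value I can vouch for.

```python
import math


def space_needed(source_chars: int, expansion_pct: float, margin_pct: float = 0.0) -> int:
    """Characters to reserve for the translation of a source string.

    expansion_pct: expected growth (+) or shrinkage (-) of the target language in percent.
    margin_pct: optional extra safety margin in percent.
    """
    factor = (1 + expansion_pct / 100) * (1 + margin_pct / 100)
    # Round first so floating-point noise does not push the result up by one.
    return math.ceil(round(source_chars * factor, 6))


def fits(source_text: str, field_width: int, expansion_pct: float, margin_pct: float = 0.0) -> bool:
    """True if the expected translation length fits into the available width."""
    return space_needed(len(source_text), expansion_pct, margin_pct) <= field_width


# Hypothetical check: a 12-character English label, assuming 25% expansion.
print(space_needed(len("Save changes"), 25))
print(fits("Save changes", 14, 25))
```

The check only flags source strings that are likely to run out of space; it says nothing about the quality of the translation itself.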

@Samuel: thanks for your table. Actually your idea with the universally translated text is very good.

What other universally translated texts would you recommend analyzing (with URLs, please)?

Do you know of any linguistic research on this matter? I don't want to spend a lot of money on useless books. Do you have any ideas about where I should look for linguistic research on language length in general?

Thanks. All the best.

[Edited at 2010-06-15 10:19 GMT]



