Word count & pay - PDF, Word and Trados
Arve-Olav Solumsmo

Local time: 14:39
Member (2008)
English to Norwegian
Jan 11, 2009

It starts in a familiar way: a disagreement over word counts and payment. I will talk about word count, but the issue is of course money. The client claims that the word count from a PDF file is always 10% lower than the Trados count. Do any of you have that experience?

I received a contract to translate half of a technical manual, along with a PDF and a statement of the word count involved. I accepted the contract and received extracted, formatted Word files. I should have become suspicious when the agency stated that words in images would be extra work but included in the pay.

The total was stated as 73370 words. The image file was 6200 words (and included all images, not just my part, but that is a side issue). I started translating in TagEditor, but when I did progress checks, I noted that my total was 79952 words, and I made the client aware of this.

The client stated:
"When we bid the project from our customer, we always convert the PDF to MS Word by Acrobat software and then use TRADOS to count the Word file. That is the international rule that you may already know (let's call this method 1). Alternatively, if we create the Word file by copy paste from the PDF file directly (let's call it method 2), and then use TRADOS to count it, then the word count is always about 10% more than the method 1. I guess you know this fact too if you did a lot of translation before. The word count we use for customers and translators is by method 1, to keep our competing advantages."
Now, I am a suspicious kind of person, so I took the PDF file, copied it to Word (copy file to clipboard) and ran Trados on the resulting file (their method 2, I believe). The total was 87745, which is more like what I expected.
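
For reference, the gaps between the three figures can be worked out with a few lines of Python. This is only arithmetic on the numbers quoted above; the variable names are made up for illustration, and nothing here assumes how any of the tools actually counts.

```python
# Figures quoted in the post above.
stated = 73370      # agency's stated total (their "method 1")
tageditor = 79952   # total seen during progress checks in TagEditor
copy_paste = 87745  # PDF copied to Word, then counted with Trados ("method 2")


def percent_over(count, baseline):
    """Return how far `count` exceeds `baseline`, as a percentage of `baseline`."""
    return (count - baseline) / baseline * 100


for label, count in [("TagEditor", tageditor), ("copy/paste", copy_paste)]:
    extra = count - stated
    print(f"{label}: {extra} words over the stated total "
          f"({percent_over(count, stated):.1f}%)")
```

On these figures the TagEditor total is about 9.0% above the stated count, and the copy/paste total is about 19.6% above it.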

Does anybody have relevant experience?


Local time: 08:39
Spanish to English
"Real" word count? Jan 11, 2009

Let me preface my remarks by saying that I have seen a lot of PDF extraction/OCR processes end up interpreting scratches, dots and other stray marks as separate words, including the "---------" on many Spanish notarized docs counting as six words. Could this be the case with your Trados process?
Having said that, it is not uncommon to have a variance between an original PDF guesstimate and the true count, and I have never had a client or outsourcer protest the "true" count as determined by ME. When major differences (above 1-2%) are discovered, I make every attempt to advise the client promptly. If the work is to be paid on the basis of a specific count and a true count will not be accepted, end of story. If a fixed price has been set for the job, then word count is irrelevant. This goes for both images and extractables.
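
To illustrate the stray-mark problem, here is a minimal sketch comparing a naive whitespace count with one that only counts tokens containing a letter or digit. Both functions and the sample line are hypothetical; this is not how Trados or any OCR tool actually counts, just a way to see how much such marks can inflate a total.

```python
def count_whitespace_tokens(text):
    """Count every whitespace-separated token, stray marks included."""
    return len(text.split())


def count_wordlike_tokens(text):
    """Count only tokens that contain at least one letter or digit."""
    return sum(1 for tok in text.split() if any(ch.isalnum() for ch in tok))


# Hypothetical extracted line in which a ruled line came out as spaced dashes.
sample = "Ante mí - - - - - - doy fe"

print(count_whitespace_tokens(sample))  # naive count, dashes included
print(count_wordlike_tokens(sample))    # count without the stray dashes
```

Here the naive count is 10 and the filtered count is 4, so the six dashes account for the whole difference.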
If the client is paying on a per-word basis, then a true count must be used. If it's "I figure this is about 70,000 words, will you do it for $XXXX?", that's a whole 'nuther story.
From what you're describing, it sounds like a sleazy outfit with their thumb on the scale, and not the kind of folks you should soil yourself dealing with.

Marijke Singer
United Kingdom
Local time: 13:39
Dutch to English
Agree with Richard Jan 11, 2009

I do a lot of my own PDF conversion, ranging from files containing a few thousand words to 100,000 words, and never has a customer disputed my Trados word count.

Marie-Céline GEORG
Local time: 14:39
English to French
International rule???? Jan 11, 2009


I always get very suspicious when I hear such things as "this is the international rule"... There is no international rule about translation! The agency may only be trying to pay less.
It's true that the two conversion methods give different results, but I've never seen a consistent percentage - it depends very much on the file contents.

Soonthon LUPKITARO(Ph.D.)
Local time: 20:39
Partial member (2004)
English to Thai
Clear statement beforehand Jan 12, 2009

To prevent such disputes, I always confirm how words will be counted before accepting the job:
1. Use MS Word if it is the simplest.
2. Use PDF if it is applicable.
3. Use Trados if it is applicable.
4. Use other CAT tools, e.g. PractiCount, SDLX, Wordfast.
Please note that each method gives a rather different word count, and this may start a dispute.

Soonthon L.

Kaiya J. Diannen
Member (2008)
German to English
My personal methodology Jan 12, 2009

Arve-Olav Solumsmo wrote: It starts in a familiar way - disagreement on word counts and payments. I will talk about word count, but the issue is of course money. ... The total was stated as 73370 words. ... I noted that my total was 79952 words, and I made the client aware of this. ... Total 87745, which is more like I expected.

Three different figures have already been calculated for this one job. This is exactly why I personally work according to one of two methodologies when PDF files are involved:
1) Payment by target word count. End of discussion.
2) If the client wishes to provide me with extracted (Word) files based on the PDF, I must have the opportunity to review them myself before accepting any word count that is the basis of payment. This gives me not only the opportunity to perform my own word count, but also to assess the quality of the extraction, which can be very relevant to translation at a later stage.

Additionally, I normally won't use Trados word counts as a basis for payment. They are often vastly different than those in Word and they don't include numbers, which in my language pair often need to be adjusted (decimals), and also repositioned in the sentences (i.e. retyped or edited for placement).
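
As a rough illustration of the numbers point, the sketch below counts the same sentence with and without standalone numbers. The function names, the number pattern and the sample sentence are all assumptions made for the example; it does not reproduce the actual counting rules of Trados or Word.

```python
import re

# Assumed pattern for a standalone number such as "30", "12.5" or "1,5".
NUMBER = re.compile(r"[+-]?\d+([.,]\d+)*")


def is_number(token):
    """Return True if the token is a number once edge punctuation is stripped."""
    return NUMBER.fullmatch(token.strip(".,;:()")) is not None


def count_tokens(text, include_numbers=True):
    """Count whitespace-separated tokens, optionally skipping numbers."""
    tokens = text.split()
    if include_numbers:
        return len(tokens)
    return sum(1 for tok in tokens if not is_number(tok))


# Hypothetical sentence from a technical manual.
sample = "Tighten the bolt to 12.5 Nm after 30 minutes."

print(count_tokens(sample))                         # numbers counted
print(count_tokens(sample, include_numbers=False))  # numbers skipped
```

For this sentence the two counts are 9 and 7. In a manual full of measurements and part numbers, a gap of that kind could add up quickly.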

I am not really sure what you can do about it after already having accepted the job. It may depend on what stage of the translation you are at, and how friendly and frequent your relations are with this particular agency.

Sergei Leshchinsky
Local time: 15:39
Member (2008)
English to Russian
A bit off the topic, but... Jan 12, 2009

... Word itself produces different results for the same text 1) in the status bar on opening; 2) in the menu File/Properties/Statistics...; 3) in the menu Tools/Statistics.

I am with Soonthon and Richard. The actual PDF word count always exceeds the guesstimate due to various text boxes and pieces of text in the images (initially assumed to be, and processed as, images without text). Even Trados misses some text and you have to check it at the DTP stage (this concerns PowerPoint presentations as well). There may be some text in the pictures...

