Word count in PDF?
Thread poster: Inga Petkelyte
Inga Petkelyte  Identity Verified
Portugal
Local time: 05:11
Lithuanian to Portuguese
Apr 30, 2014

Can anyone help with this? I need to check the word count in PDF files; I mean image PDF files, such as certificates, diplomas, etc., where the text cannot be copied and pasted into a Word document.
I looked through the topics here but couldn't find what I thought was already in the forums.
Does anyone have some useful tips on this?


 
Andrea Garfield-Barkworth  Identity Verified
Germany
Local time: 06:11
Member (2015)
German to English
PractiCount & Invoice Apr 30, 2014

PractiCount & Invoice is what I use. It can also count words in PDFs.

 
Translations2u
United Kingdom
Local time: 05:11
Conversion software Apr 30, 2014

Nitro PDF is quite useful, although nothing seems to work well for pure images. You can download this software and use it for a trial period. It allows you to convert files to Word, RTF, etc. and also to edit PDFs. There is also an online conversion site I have used called onlineOCR.net. Maybe these will help. Even if they don't always produce a fully editable document that maintains the source format, they can at least give you a better idea of the word count from the converted version.

 
Ala Tolos
Lithuania
Local time: 07:11
English to Lithuanian
ABBYY Apr 30, 2014

Hi!
I am using ABBYY FineReader (an OCR tool) for this purpose; it works almost perfectly and also supports Lithuanian. You can use any other OCR tool. After the programme converts the file into Word, you can count the words as usual.
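
Once an OCR tool has produced plain text, the count itself is simple. Here is a minimal Python sketch, assuming the OCR output is already available as a string; the sample text and the function name `count_words` are made up for illustration. It counts whitespace-separated tokens, so the total may differ slightly from what Word reports.

```python
def count_words(text: str) -> int:
    """Count whitespace-separated tokens in a piece of text."""
    # str.split() with no arguments splits on any run of spaces, tabs or newlines
    return len(text.split())


# Hypothetical OCR output from a scanned certificate (illustration only)
ocr_text = """DIPLOMA
This is to certify that Jane Doe
has completed the course."""

print(count_words(ocr_text))
```

Stamps, seals and handwriting are often misread by OCR, so a count obtained this way is best treated as an estimate.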


 
Tony M
France
Local time: 06:11
Member
French to English
SITE LOCALIZER
Check out existing posts Apr 30, 2014

This topic has been covered extensively in the past — there is of course no possible way of doing a word count in an image PDF file, other than by converting it first to editable text, e.g. by using an OCR application.

Depending on what the purpose of your word count is, this may or may not be economically worthwhile, in terms of both your time, and the possible investment required.

If the purpose is solely to cost a translation job, then I would be tempted to charge a minimum fixed charge per page. I usually find most certificates come out way below the number of words corresponding to my normal minimum charge, and any excess helps pay for the disproportionate amount of work this type of job usually entails. Of course, if there are several, you could always quote a 'package price' for them all together; since we don't have fixed direct costs 'per word', it is often better to oblige a customer with a price that is going to be fair overall, rather than waste inordinate amounts of time (my most precious resource!) trying to arrive at a word count that is accurate to the last word.

Alternatively, if you find that method too risky, you could simply charge based on a target word count — you are probably already aware of the approximate scaling factor used between your own source and target languages, so that is something you can agree with the customer in advance. For example, in my FR > EN language pair, I know that taken overall, French texts in the kind of fields in which I work tend to be 5–10% longer than the resulting EN translation — so if a source word count isn't available, I simply charge on the basis of the target word count × my normal rate + 10%.
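
The target-count pricing described above can be written out as a small calculation. This is only a sketch: it reads "target word count × normal rate + 10%" as a 10% uplift on the whole amount, and the function name, the rate of 0.10 per word and the 2000-word job are invented for the example.

```python
from decimal import Decimal, ROUND_HALF_UP


def quote_from_target_count(target_words: int, rate_per_word: str,
                            uplift: str = "0.10") -> Decimal:
    """Price a job from the target word count, plus a percentage uplift."""
    # Decimal avoids binary floating-point rounding surprises with money
    base = Decimal(target_words) * Decimal(rate_per_word)
    total = base * (Decimal("1") + Decimal(uplift))
    # Round to two decimal places, as for most currencies
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Hypothetical job: 2000 target words at 0.10 per word, with the 10% uplift
print(quote_from_target_count(2000, "0.10"))
```

The uplift is passed as a parameter because the scaling factor depends on the language pair and should be agreed with the customer in advance.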

[Edited at 2014-04-30 15:34 GMT]


 
TB CommuniCAT  Identity Verified
Canada
Local time: 01:11
English to French
Word count in PDF Apr 30, 2014

Just a suggestion... You can try importing the PDF file into your CAT tool and letting it convert the file to a Word document. Then you can do a word count from there. Hope it works.

 
John Fossey  Identity Verified
Canada
Local time: 01:11
Member (2008)
French to English
Some PDF wordcount tools Apr 30, 2014

You can upload a PDF to Wordfast Anywhere at freetm.com and then download the converted Word document.

For PDF documents that contain text, you can install a word counter into Acrobat Reader, available from http://abracadabrapdf.net/utilities-in-english/acrobat-utilities/abracadabratools_en/

[Edited at 2014-04-30 15:38 GMT]


 
Inga Petkelyte  Identity Verified
Portugal
Local time: 05:11
Lithuanian to Portuguese
TOPIC STARTER
Confused now Apr 30, 2014

Because of my language combinations, I don't own CAT tools: where there is 1 fixed word in the source language, there may be 7 words with different endings in the target language, and so far I haven't seen a CAT tool "knowing" which ending to attach.

And yes, I know this topic has already been covered; that's exactly what I said in my original post.

I'll try the suggested trial versions, thank you all.
I need this because I started collaborating with a new agency and they send me (almost) everything in PDF. Several times already, my feeling was that there were far more words than stated (and paid for). Once I checked, as the PDF allowed copying and pasting into Word. The difference: more than 30% of the initially declared word count. The PM's answer: "I forgot to count the last page". Well, it happens. But with the previous orders, and with the one I have now, I would like to be able to double-check.
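
For the double check itself, the discrepancy can be expressed as a percentage of the declared count. This is a rough sketch; the function name and the figures (3000 declared, 4000 counted) are invented for illustration and are not the numbers from the actual order.

```python
def excess_percent(declared: int, counted: int) -> float:
    """Percentage by which the counted words exceed the declared word count."""
    if declared <= 0:
        raise ValueError("declared word count must be positive")
    # The difference is measured against the declared (paid-for) count
    return (counted - declared) / declared * 100


# Hypothetical order: the agency declared 3000 words, the recount gives 4000
print(round(excess_percent(3000, 4000), 1))
```

A negative result would mean the recount came out below the declared figure.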


 
Tony M
France
Local time: 06:11
Member
French to English
SITE LOCALIZER
CAT tools for inflected forms Apr 30, 2014

Inga Petkelyte wrote:

Because of my language combinations, I don't own CAT tools: where there is 1 fixed word in the source language, there may be 7 words with different endings in the target language, and so far I haven't seen a CAT tool "knowing" which ending to attach.


I do sympathize — this is a problem even in languages with fewer inflexions, but it must be a nightmare with so many!

Just a thought, though — are you suggesting that a source word might have a single root translation, just with inflected endings? If this is the case, then even a standard CAT tool ought to be some help. My language pair is FR > EN, where there isn't too much of a problem, as EN has few inflected forms; still, it is a bit of a nuisance to have to select from even a small number of glossary entries every time. So what I do is teach my glossary just the root of each target word, which coupled with fuzzy source term recognition enables me to usually get the right root, and then only have to type the ending onto it — so for a verb, it might mean typing the -s / -ing / -ed, for example. I realize the usefulness of this depends on the relative lengths of roots vs inflexions in your particular target language; but I realized the usefulness of it when I had had to type (e.g.) "non-steroidal anti-inflammatories" enough times, and realized that a CAT tool could save me that work!
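
To make the idea concrete, here is a toy Python sketch of a glossary that stores only target roots and finds them through fuzzy matching on the source term. It is not how any particular CAT tool works internally; the glossary entries, the function name `lookup_root` and the 0.8 cutoff are assumptions chosen for the example, and the matching uses `difflib` from the Python standard library.

```python
import difflib
from typing import Dict, Optional

# Hypothetical FR > EN glossary: source term -> ROOT of the target word
glossary = {
    "contrat": "contract",
    "anti-inflammatoire": "anti-inflammator",
}


def lookup_root(source_term: str, entries: Dict[str, str],
                cutoff: float = 0.8) -> Optional[str]:
    """Return the target root of the closest glossary entry, or None."""
    # Fuzzy matching lets an inflected source form still find its base entry
    matches = difflib.get_close_matches(
        source_term.lower(), list(entries), n=1, cutoff=cutoff
    )
    return entries[matches[0]] if matches else None


root = lookup_root("contrats", glossary)  # plural source form
if root is not None:
    # The translator then only types the ending onto the root
    print(root + "s")
```

How much typing this saves depends, as noted above, on how long the roots are compared with the endings in the target language.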


 
Inga Petkelyte  Identity Verified
Portugal
Local time: 05:11
Lithuanian to Portuguese
TOPIC STARTER
Old fashioned Apr 30, 2014

Tony, in such cases I use an old-fashioned method: I mark a repetitive word or phrase as, say, CC. Then I replace it with the root(s), and then add the endings. One of the most frequent examples in my area is "the present Contract/Agreement". If my target language is my native one, I will certainly need to add 6 different endings to both the adjective and the noun. So, if I have to add endings anyway, I still doubt whether I need a CAT tool. Maybe some day I will discover the difference, just as happened when I switched from a typewriter to typing on a computer :)


 
Thomas Rebotier  Identity Verified
Local time: 22:11
English to French
Words count / CAT tool Apr 30, 2014

The word count is one thing, the CAT tool another.

OCR --> MS Word word count usually works well; you just have to check in the Word file that the OCR did not create too many images with text inside (some OCR programs do that in some situations).

Not using a CAT tool is your loss at this point. CAT does not work word by word!!! Even the simple ability to retain 100% matches (sentences repeated in full) is precious. But the most useful feature is the ability to search for particular expressions to stay consistent, sometimes across projects that are several months apart.


 
Catherine MacLaine
Canada
Local time: 01:11
English to French
Open in Word Dec 12, 2018

I just went through the same problem: I had to give a quote and count the words in a 122-page manual in PDF format, and was unable to copy and paste any part of the document. This is what worked for me: right-click on the document, go down to Open with, then select Word. Microsoft Word actually converted the entire manual to a perfect, totally editable Word document, with the word count shown at the bottom of the page. Hopefully, this easy trick will work for someone else.
Cat


 
Ilya Razmanov
Russian Federation
Local time: 08:11
English to Russian
OCR software Dec 13, 2018

Tony M wrote:

Depending on what the purpose of your word count is, this may or may not be economically worthwhile, in terms of both your time, and the possible investment required.


Speaking of investment, OCR software often comes bundled with scanners (as a result, I have several OCR programs I don't use). Even if it can't read your language, it is still likely to recognize the spaces dividing words.


 

