Urgent advice for word count of pdf-file needed
Thread poster: Annett Hieber

Annett Hieber  Identity Verified
Germany
Local time: 15:28
English to German
Feb 21, 2012

Hi all,

I was approached by a very good end client of mine with a new (and urgent) large translation project. As this is a direct client, he doesn't know anything about word-counting methods or the related problems with PDF files. The document he sent me is a protected PDF, and I don't think he is in possession of the original document. This company has sent me other PDF documents before; however, it was always possible to copy the text into a Word document and so obtain at least a rough word count.

This client is very important for me and I really like doing business with them. They are always very fair and generous and good payers.

I'm now really wondering how to give them a cost estimate for this document. Is there any possibility besides OCR software (which I do not have)? E.g., can you deduct any percentage from the finished translation (a simple word file) in order to obtain the approx. word count of the original?

Any advice/help would be very much appreciated!

Thank you!

Annett


 

Tom in London
United Kingdom
Local time: 14:28
Member (2008)
Italian to English
Try Feb 21, 2012

Try this pdf to Word converter:

http://www.pdftoword.com/Default.aspx

[Edited at 2012-02-21 14:44 GMT]


 

mariviguijo
Spain
Local time: 15:28
French to Spanish
+ ...
Online OCR + averages Feb 21, 2012

Hi Annett,
I was in the same situation some months ago and the document was very large. I decided to OCR sample pages with more text and with less text (try the online OCR www.onlineocr.net), then multiply each count by the number of corresponding heavy/light pages. The difference between the original word count and the translation word count was less than 5%.
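The arithmetic behind this sampling approach can be sketched in a few lines of Python. The function name is hypothetical, and the page counts and words-per-page figures are made-up illustration values, not numbers from the actual document.

```python
def estimate_total_words(heavy_pages, words_per_heavy_page,
                         light_pages, words_per_light_page):
    """Estimate a document's word count from OCR'd sample pages.

    Each sample count is multiplied by the number of pages of that kind.
    """
    return (heavy_pages * words_per_heavy_page
            + light_pages * words_per_light_page)


# Hypothetical example: 120 text-heavy pages at about 450 words each,
# plus 30 light pages (figures, tables) at about 150 words each.
estimate = estimate_total_words(120, 450, 30, 150)
print(estimate)
```

Sampling more than one page of each kind and averaging the counts should make the estimate less sensitive to an unusual page.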

I hope this will help,


 

Vincent Lemma  Identity Verified
Italy
Local time: 15:28
Member (2008)
Italian to English
+ ...
Some options Feb 21, 2012

Hi Annett,

For me, OCR conversion is the best option in terms of actual word count.

Otherwise, try pricing per page, using a mean price based on the type of document at hand.
Consider these facts:
Does it have many graphs, pictures and so forth (placeables)?
Is it double or single spaced?
Does the client need desktop publishing?
Is the text simple to read?

I don't know if Trados can process a pdf file to analyse it or not. You might want to try, or perhaps someone with more knowledge on this issue can help out.

Also try "copy and paste" from the PDF, but if the text is protected, I am not sure you can do that.

Otherwise, I'd go with a steady price per page based on your experience.


 

Tom in London
United Kingdom
Local time: 14:28
Member (2008)
Italian to English
Another option Feb 21, 2012

I would also recommend ABBYY FineReader. I use it all the time and it hasn't let me down yet. Not too expensive if you consider it as a very good future investment!

http://finereader.abbyy.com/

[Edited at 2012-02-21 14:56 GMT]


 

Michael Wetzel  Identity Verified
Germany
Local time: 15:28
German to English
Also recommend ABBYY FineReader Feb 21, 2012

Like Tom, I can recommend ABBYY FineReader.

There is a free trial version that may or may not be enough to help you with your estimate. I ended up using the trial version for a rare job that unavoidably involved PDFs and was impressed enough that I bought the software.

Sincerely,
Michael

[Edited at 2012-02-22 09:07 GMT]


 

LEXpert  Identity Verified
United States
Local time: 08:28
Member (2008)
Croatian to English
+ ...
Some ideas Feb 21, 2012

Can you print it? If so, print it, then scan it and OCR from that. As the original quality seems relatively good from your description, you should be able to get good results from that process. If you don't have OCR software and don't want to buy it, Wordfast Anywhere will OCR an uploaded file, and Wordfast also have a new service where you can e-mail them a file and an automated process will send you back a word count.

If you can't even print the document, your choices are to estimate the total by ballpark words per page or a similar method, or simply to bill the client by target word count, adjusting your usual source rate to account for any word count contraction between source and target as well as any hassle from not being able to access the source file except on your computer screen.


can you deduct any percentage from the finished translation (a simple word file) in order to obtain the approx. word count of the original?


Of course - but if the total won't be known until the end anyway, why not just charge them by target word count with the rate adjusted as described above?
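As a rough illustration of that rate adjustment, here is a minimal Python sketch. The function names, the rate, and the target-to-source ratio are all hypothetical; the real ratio depends on the language pair and the type of text.

```python
def target_rate(source_rate, target_per_source_ratio):
    """Convert a per-source-word rate into an equivalent per-target-word rate.

    target_per_source_ratio is the assumed number of target words produced
    per source word (e.g. 0.90 if the translation is 10% shorter).
    """
    if target_per_source_ratio <= 0:
        raise ValueError("ratio must be positive")
    return source_rate / target_per_source_ratio


def invoice_total(target_words, source_rate, target_per_source_ratio):
    """Bill by target word count at the adjusted rate, rounded to cents."""
    rate = target_rate(source_rate, target_per_source_ratio)
    return round(target_words * rate, 2)


# Hypothetical example: usual source rate of 0.10 per word, translation
# assumed to come out 10% shorter, 45,000 words in the finished target file.
print(invoice_total(45000, 0.10, 0.90))
```

With these assumed figures the client pays the same as for 50,000 source words at the usual rate, which is the point of the adjustment.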

[Edited at 2012-02-21 15:31 GMT]


 

Vincent Lemma  Identity Verified
Italy
Local time: 15:28
Member (2008)
Italian to English
+ ...
ABBYY is good. For more complex work, OmniPage Feb 21, 2012

ABBYY is good, streamlined software, and I recommend it.

I also purchased OmniPage for DTP of PDF files.
It is a bit much, but it really has it all, and the OCR is pretty good.
I believe they are up to OmniPage 18 if you want to have a look.

If you seldom work with PDFs, then go with ABBYY.


 

Alex Lago  Identity Verified
Spain
Local time: 15:28
Member (2009)
English to Spanish
+ ...
WFA Feb 21, 2012

I think ABBYY is your best bet and very reasonably priced, but if you don't want to spend anything, Wordfast Anywhere does a very good job of converting PDFs into Word files.

 

Elizabeth Joy Pitt de Morales  Identity Verified
Local time: 15:28
Member (2007)
Spanish to English
+ ...
Quick fix Feb 21, 2012

For a quick wordcount, do this:

1. Attach document to an email sent to wordcount@wordfast.com.

2. The subject line of the email should contain only the source language of the document (e.g., GERMAN or ENGLISH).

3. Send file.

In a few minutes you'll get an email with a word count that includes all items.

I've used this free service from the folks at Wordfast several times with good results. You can send Word, PDF (scanned and live), Excel, PowerPoint, Image (TIFF, JPG, BMP, PNG), Rich Text (RTF) and plain text (TXT) files.

And I agree with Alex Lago; Wordfast Anywhere's conversion is very good, indeed.

You can use it (www.freetm.com) to convert and download your doc in Word format, or actually do the translation on-line with Wordfast Anywhere.

-Liz


 

Walter Moura  Identity Verified
Brazil
Local time: 10:28
Member
English to Portuguese
Wordfast Anywhere Feb 21, 2012

Hello, Annett.

All suggestions presented by our colleagues are very good. However, they forgot the part where you said the document is protected. None of these applications will convert protected files. I had the same problem some months ago with some very large documents and here is how I solved it:

First, I assume you can open and read the files. If this is so, go to

http://www.freemypdf.com/

to unprotect the files.

Then, when the files have been downloaded as unprotected .PDF back to you, upload them to Wordfast Anywhere, which will convert them to MSWord format and ask if you want to download them for checking.

Then you can use the MSWord counting function. Or you can use the Wordfast Anywhere word counting option, which is also free.
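If the converted text is also saved as plain text, the count is easy to cross-check with a short script. This is only a sketch: it counts whitespace-separated tokens, which may differ slightly from what MSWord or Wordfast Anywhere report, and the helper names are hypothetical.

```python
def count_words(text):
    """Count whitespace-separated tokens in a string."""
    return len(text.split())


def count_words_in_file(path, encoding="utf-8"):
    """Count words in a plain-text file exported from the converted document."""
    with open(path, encoding=encoding) as handle:
        return count_words(handle.read())


sample = "Then you can use the MSWord counting function."
print(count_words(sample))
```

Numbers, headers and footers are counted like any other token here, so the result is best treated as a ballpark figure for the estimate.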

I hope to have been of help.


Good luck.

[Edited at 2012-02-21 19:44 GMT]


 

LEXpert  Identity Verified
United States
Local time: 08:28
Member (2008)
Croatian to English
+ ...
Protection issues Feb 21, 2012

Walter Moura wrote:

However, they forgot the part where you said the document is protected.
Good luck.

[Edited at 2012-02-21 19:44 GMT]


Surely we're not quite that forgetful, Walter.
That's why I asked if she could print it and scan the printout. Different levels of protection can be set up in Acrobat, to prevent modification, copying, resaving, extraction, printing, or some combination thereof. Still, to give better advice it would be helpful to know the exact nature/extent of the protection.


 

Annett Hieber  Identity Verified
Germany
Local time: 15:28
English to German
TOPIC STARTER
Thanks to you all! Feb 21, 2012

I wasn't aware of the complexity of this topic and I would like to thank you all for your good advice!

What ultimately did the trick was Walter's suggestion. I am really thankful. However, I should have been more specific: I was able to print the document, though (I've got another document waiting that I cannot print....).

Surely, there is a reason for protection. Regulations, guidelines or other documents must not be altered or misused. In my case, the document is an operation manual about how to obtain a special certification. My client (his institute) is going to apply for this certification (American) and wants to make sure that he and his employees understand everything correctly.

Have a good night's sleep!

Annett


 

Annett Hieber  Identity Verified
Germany
Local time: 15:28
English to German
TOPIC STARTER
Right! Feb 21, 2012

Rudolf Vedo CT wrote:

Walter Moura wrote:

However, they forgot the part where you said the document is protected.
Good luck.

[Edited at 2012-02-21 19:44 GMT]


Surely we're not quite that forgetful, Walter.
That's why I asked if she could print it and scan the printout. Different levels of protection can be set up in Acrobat, to prevent modification, copying, resaving, extraction, printing, or some combination thereof. Still, to give better advice it would be helpful to know the exact nature/extent of the protection.


Thanks Rudolf,

You are absolutely right here, Rudolf, and I will keep that in mind!

Annett


 

Annett Hieber  Identity Verified
Germany
Local time: 15:28
English to German
TOPIC STARTER
It worked! Feb 21, 2012

Walter Moura wrote:

Hello, Annett.

All suggestions presented by our colleagues are very good. However, they forgot the part where you said the document is protected. None of these applications will convert protected files. I had the same problem some months ago with some very large documents and here is how I solved it:

First, I assume you can open and read the files. If this is so, go to

http://www.freemypdf.com/

to unprotect the files.

Then, when the files have been downloaded as unprotected .PDF back to you, upload them to Wordfast Anywhere, which will convert them to MSWord format and ask if you want to download them for checking.

Then you can use the MSWord counting function. Or you can use the Wordfast Anywhere word counting option, which is also free.

I hope to have been of help.


Good luck.

[Edited at 2012-02-21 19:44 GMT]


Hello Walter,

It really worked! Thank you for the good description which even I could understand as a layman in terms of IT-matters.....

Annett


 

