word count for scanned PDF files
Thread poster: Tai Fu
| | Tai Fu
Local time: 18:26
Chinese to English
The agency I am now dealing with says that its clients may submit PDF files. I have looked at such PDF files before, and they turned out to be scanned pages, so it is impossible to do a word count. The problem is not only that you can't do a word count: my usual trick of converting Simplified Chinese into Traditional Chinese for easier reading won't work, and CAT tools won't work either. The agency said they wanted to set a target word rate. I said it is impossible to bill at a target word rate because there is no way to predict the number of target words until the job is complete. Also, in the last job I did for them there were fewer than half as many target words as source characters, so I would get paid next to nothing under that scheme. So I told them that I would prefer billing at a per-line or per-page rate.
Has anyone received this type of file, and how do you usually bill your client? Is there an easy way to OCR those scanned pages so that I can convert between variants of Chinese and use CAT tools?
| | imatahan
Local time: 23:26
English to Portuguese
| You can do per hour/ per page || Jun 6, 2010 |
You can charge per hour or per page.
A page has around X words, depending on the letter width. You can calculate an average value.
Or you can charge per hour.
I work a lot with this type of file because I do legal translations, where you usually get copies of parts of a case and there is no way to have them in digital form other than PDF.
We work out an average for the pages, count them, and charge.
You should try OCR software, such as ABBYY FineReader. I don't know if it works for Chinese, but it converts scanned PDF files into Word (.doc) documents. It might be helpful in your case.
Then you can count the words and translate the document with any CAT tool.
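To make the averaging concrete, here is a minimal sketch in Python. The function name and the sample numbers are made up for illustration; the idea is just to hand-count a few representative pages and scale the average up to the whole document.

```python
def estimate_total_words(sample_page_counts, total_pages):
    """Estimate a document's word count from a few hand-counted sample pages."""
    if not sample_page_counts:
        raise ValueError("count at least one sample page")
    # Average words per page across the pages that were counted by hand.
    average_per_page = sum(sample_page_counts) / len(sample_page_counts)
    # Scale the average up to the full page count.
    return round(average_per_page * total_pages)


# Hypothetical numbers: three pages counted by hand, 40 pages in total.
estimate = estimate_total_words([310, 280, 295], total_pages=40)
print(estimate)
```

The more sample pages you count, and the more typical they are, the closer the estimate should be.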
... are always a pest when it comes to making use of good TMs built up over the years. I always try to OCR them (OmniPage 17 is my current tool), but the result is not always good copy, and proofing it is often just as time-consuming as typing out the whole thing. Then, of course, you don't get the benefit of an increased TM that will be helpful in the future.
An OCR scan does, however, let you get a word count. Otherwise, I always charge per target word. I tell the client they will just have to wait until the job is done, and that I will allow a 10% reduction in the number of words, since Norwegian has many more compound words than English, so the word count tends to be higher after translation. Most clients have been happy with this arrangement so far. I do have a few regular customers for whom I sometimes do "guesstimates".
Good luck with the dreaded PDF world!!
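For anyone who wants the target-word arrangement as arithmetic, here is a rough Python sketch. The function name and the example count are hypothetical; only the 10% allowance comes from the post above.

```python
def billable_target_words(target_words, reduction_percent=10):
    """Apply a percentage allowance to the final target word count."""
    if not 0 <= reduction_percent <= 100:
        raise ValueError("reduction_percent must be between 0 and 100")
    # Bill only the share of target words left after the allowance.
    return round(target_words * (100 - reduction_percent) / 100)


# Hypothetical job: 5,000 target words once the translation is finished.
print(billable_target_words(5000))
```

The billable count is then multiplied by whatever per-word rate was agreed with the client.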
| | xxxMaren Paetzo
Local time: 03:26
Italian to German
| word count in graphics || Jun 6, 2010 |
If you are looking for a tool to count words in graphics: I have never tested it, but they announced that the new version of AnyCount can even do word counts in graphics such as JPGs and scanned PDFs.
I'm still using an older version, but it is useful and works fine, so I think I will upgrade to AnyCount 7 soon.
You can find more information here: http://www.translationmanagementsystem.com/word_character_line_count_software.html
| | Shouguang Cao
Local time: 10:26
English to Chinese
| Manual counting || Jun 7, 2010 |
Wrestling with OCR software just to get the word count can be even more time-consuming.
What you can do is a rough manual count. Count the lines on a page and the words in a typical line, then do some simple arithmetic like this:
Words per line × lines per page × number of pages.
Always works for me!
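The manual count above can be written as a short Python sketch. The function name and the figures are invented for illustration, and it assumes every page has roughly the same layout.

```python
def rough_word_count(words_per_line, lines_per_page, pages):
    """Rough estimate: words per line x lines per page x number of pages."""
    return words_per_line * lines_per_page * pages


# Hypothetical scan: about 12 words per line, 30 lines per page, 25 pages.
print(rough_word_count(12, 30, 25))  # prints 9000
```

For Chinese source text the same arithmetic works with characters per line instead of words per line.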
[Edited at 2010-06-07 04:44 GMT]