Translating a scanned book in .pdf format
Thread poster: Olga Dyussengaliyeva

Olga Dyussengaliyeva  Identity Verified
Kazakhstan
Local time: 00:39
English to Russian
+ ...
Feb 21, 2012

Dear colleagues,
I've received a translation order in the form of a scanned book in .pdf format.
This doesn't make life easy at all... :) With that in mind:
Is there any way to convert the files into MS Word? I really doubt it, though, since this is an image rather than text...
Is there any way I can upload it to a CAT tool (Wordfast)?
How can I count the words in the text to quote a price for the order in this case?
And are there any additional charges that translators apply if the source document isn't provided in a standard, convenient and "CATable" format?
Thanks a lot!


 

Tony M  Identity Verified
France
Local time: 21:39
Member
French to English
+ ...
PDF > DOC conversion using OCR Feb 21, 2012

The usual workflow is to convert your PDF (image) file using an Optical Character Recognition application, such as the highly regarded ABBYY FineReader.

This process has been discussed at great length before in these forums, and I suggest you try a search on key words like PDF + OCR to access the wealth of information already available.

Some of the more sophisticated programs will automatically handle multiple pages, etc., and can produce quite fair facsimiles of the original document; however, for subsequent CAT processing, I've found it's better to use the 'unformatted text' option, in order to avoid all sorts of hassles with the CAT tool not liking the 'tricks' the program uses to try and reproduce the formatting.

As for extra charges, I do reserve the right to make an additional charge for the pre-processing of non-standard texts — but I rarely apply it!

However, if you have to go out of your way to invest in special software for this sort of job, you may feel you'd like to make a small charge in order to try and recoup your investment — but beware of being undercut by those who don't!

[Edited at 2012-02-21 15:23 GMT]
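
Once OCR has produced plain text, the word count needed for a quote can be estimated with a few lines of code. The following is only a minimal sketch, not something posted in the thread: the function names, the sample sentence and the per-word rate are all hypothetical, and the counting rule (a run of letters or digits, with internal apostrophes or hyphens kept together) will not match the count in Word or in any particular CAT tool exactly.

```python
import re


def count_words(text: str) -> int:
    """Count words in OCR'd plain text.

    A word here is a run of letters/digits, optionally joined by
    apostrophes or hyphens (e.g. "contractor's", "load-bearing").
    This is an assumed rule for illustration; other tools count differently.
    """
    return len(re.findall(r"\w+(?:['’-]\w+)*", text))


def quote(text: str, rate_per_word: float) -> float:
    """Return a price estimate: word count times a (hypothetical) per-word rate."""
    return round(count_words(text) * rate_per_word, 2)


if __name__ == "__main__":
    sample = "The contractor's load-bearing walls shall be inspected."
    print(count_words(sample))
    print(quote(sample, 0.10))
```

Because `\w` is Unicode-aware in Python, the same rule also works for Russian or other non-Latin text. OCR errors (split or merged words) will shift the count slightly, so the result is best treated as an estimate.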


 

isabel murillo  Identity Verified
Local time: 21:39
English to Spanish
+ ...
always PDF or hardcopy/CAT tool for a book??? Feb 21, 2012

I receive almost all the books I translate in .pdf format or in hardcopy. I think this is more or less the standard in the publishing business. Authors don't usually deliver their work to the world as an electronic file. So I don't see the need, or the sense, in applying an extra charge for this kind of source document.

Regarding rate calculation, at least in Spain and the majority of European countries, the rate is agreed through a contract and is based on translated characters (no source word count).

Are you seriously thinking of translating a book with a CAT tool? Are you talking about a technical manual or a "book" (fiction or non-fiction)?
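
The character-based calculation mentioned above can be illustrated as follows. This is a rough sketch under assumptions that are not in the thread: the rate is expressed per 1,000 characters of the translated text, and whether spaces are counted is left as a parameter, since contracts differ on that point. The function name and figures are hypothetical.

```python
def fee_by_characters(
    translated_text: str,
    rate_per_1000_chars: float,
    count_spaces: bool = True,
) -> float:
    """Fee based on the length of the translated (target) text.

    rate_per_1000_chars is a hypothetical contract rate; count_spaces
    controls whether whitespace characters are included in the count.
    """
    if count_spaces:
        chars = len(translated_text)
    else:
        chars = sum(1 for c in translated_text if not c.isspace())
    return round(chars / 1000 * rate_per_1000_chars, 2)


if __name__ == "__main__":
    target = "x" * 2000  # stand-in for a 2,000-character translation
    print(fee_by_characters(target, 15.0))
```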


 

Olga Dyussengaliyeva  Identity Verified
Kazakhstan
Local time: 00:39
English to Russian
+ ...
TOPIC STARTER
technical text Feb 21, 2012

isabel murillo wrote:
Are you seriously thinking of translating a book with a CAT tool? Are you talking about a technical manual or a "book" (fiction or non-fiction)?




It is a technical text, a Technical Code actually.
Well, I try to translate my technical texts with a CAT tool, since it is good practice for maintaining a glossary and TM; besides, technical texts normally have many repeated parts, which makes a TM quite handy. Normally I receive source texts in electronic format, so this issue is a new experience for me.
So how do you deal with .pdf files, then?
Thanks a lot for the response!


 

Olga Dyussengaliyeva  Identity Verified
Kazakhstan
Local time: 00:39
English to Russian
+ ...
TOPIC STARTER
Thanks, Tony Feb 21, 2012

Your response made some things clear. Thanks!

 

Mark Hamlen  Identity Verified
France
Local time: 21:39
Member (2010)
French to English
+ ...
Another option Feb 21, 2012

I've never bought ABBYY FineReader, but I often use this site: http://www.onlineocr.net

It converts files very well. You need to buy credit in advance, but I haven't used up my 25 euro credit in even a year. It's very cheap and effective.


 

Giles Watson  Identity Verified
Italy
Local time: 21:39
Italian to English
Is the PDF an export from InDesign or some other DTP format? Feb 21, 2012

isabel murillo wrote:

I receive almost all the books I translate in .pdf format or in hardcopy.



It depends on the book, but if you ask, publishers can often give you an InDesign .idml file or similar, which you can then translate directly with a CAT tool. This will take care of most of the formatting and save the copy editor's time. The problem is that the publishing people who deal with translators are often not really aware of what can and cannot be done with translation memory software and DTP programs. The upshot is that you don't achieve the optimum workflow with the tools available.



Are you seriously thinking of translating a book with a CAT tool? Are you talking about a technical manual or a "book" (fiction or non-fiction)?



Not "thinking" about it so much as actually doing it.

Most of my book translations have been done with CAT tools, and access to my translation memories and termbases has been of great benefit. In any case, if you are comfortable with your CAT tool of choice, there is not a lot of difference between translating a DTP application file in, say, Studio and a .docx file in Word, except that the output from the former will be more use to the publisher. The time savings alone over a plain vanilla paper-and-'puter approach can generate value (i.e., better rates).


 

Elizabeth Joy Pitt de Morales  Identity Verified
Local time: 21:39
Member (2007)
Spanish to English
+ ...
Wordfast Anywhere Feb 21, 2012

Hi Olga,

If it's a scanned pdf, you can upload it to Wordfast Anywhere (www.freetm.com) and convert it to a Word doc. You can then translate it right there on-line, or download and translate with Wordfast Classic or Pro.

I do those kinds of projects all the time and they're no problem.


-Liz


 

Olga Dyussengaliyeva  Identity Verified
Kazakhstan
Local time: 00:39
English to Russian
+ ...
TOPIC STARTER
Wordfast really can do it? Feb 21, 2012

Elizabeth Joy Pitt de Morales wrote:

Hi Olga,

If it's a scanned pdf, you can upload it to Wordfast Anywhere (www.freetm.com) and convert it to a Word doc. You can then translate it right there on-line, or download and translate with Wordfast Classic or Pro.

I do those kinds of projects all the time and they're no problem.


-Liz

In fact I do use WFA a lot; great news that it can handle the .pdf!


 

Dominique Pivard  Identity Verified
Local time: 22:39
Finnish to French
Size might be an issue Feb 22, 2012

Elizabeth Joy Pitt de Morales wrote:
If it's a scanned pdf, you can upload it to Wordfast Anywhere (www.freetm.com) and convert it to a Word doc.

Sure, see http://youtu.be/ZwYgFbWzpFQ
However, Olga may have a size problem with an entire book, as I don't think Wordfast Anywhere will accept PDFs of just any size. It's possible to break a PDF into smaller chunks, but that can be tedious if the PDF is very large.

Elizabeth Joy Pitt de Morales wrote:
You can then translate it right there on-line, or download and translate with Wordfast Classic or Pro.

You can even translate it with competing tools, or with no tool at all, since what you get is a standard RTF file you can open in Word.
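
Working out the page ranges for the smaller chunks mentioned above is the easy part to automate. The sketch below only computes the ranges; it does not touch the PDF itself, which would need a separate PDF tool or library. The function name and the chunk size of 100 pages are purely hypothetical, since the thread does not state what size limit Wordfast Anywhere applies.

```python
def page_chunks(total_pages: int, pages_per_chunk: int) -> list[tuple[int, int]]:
    """Split pages 1..total_pages into consecutive (first, last) ranges.

    pages_per_chunk is an assumed limit chosen by the user; the last
    range may be shorter than the others.
    """
    if total_pages < 0 or pages_per_chunk < 1:
        raise ValueError("total_pages must be >= 0 and pages_per_chunk >= 1")
    return [
        (start, min(start + pages_per_chunk - 1, total_pages))
        for start in range(1, total_pages + 1, pages_per_chunk)
    ]


if __name__ == "__main__":
    # A 250-page scan split into chunks of at most 100 pages.
    for first, last in page_chunks(250, 100):
        print(f"pages {first}-{last}")
```

Each range can then be extracted as its own file with whatever PDF splitter is at hand and uploaded separately.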


 

Olga Dyussengaliyeva  Identity Verified
Kazakhstan
Local time: 00:39
English to Russian
+ ...
TOPIC STARTER
Useful video! Feb 22, 2012

Thanks for the video, Dominique, it's useful as always.
Could you estimate how many PDF pages can be uploaded to WFA?
One of my documents worked, but uploading another failed; maybe it was too big.
Thanks!


 

