Scanned PDF Files
Thread poster: Trevor Chichester

Trevor Chichester  Identity Verified
United States
Local time: 09:44
Member (2012)
German to English
May 17, 2012

Good Afternoon All!

So... I was wondering, what percentage of your work each year comes as scanned PDFs?

Strangely, more and more of my translations have been from dead PDFs. Right now, I'm working on 13K worth of dead PDFs, and to be honest, it is QUITE the headache to deal with this file format.

How do you guys combat this? Do you rewrite the PDF? Or do you have an OCR converter?

I personally have a great OCR converter, but I still have to wade through the entire file looking for errors before putting it into Trados.

How do you guys deal with these files?



Cheers,

Trev



Paulo Eduardo - Pro Knowledge  Identity Verified
Brazil
Local time: 11:44
Member (2008)
Portuguese to English
have fun! May 17, 2012

www.freepdfconvert.com/

www.pdfonline.com/

www.freepdfconvert.com/pdf_converter_desktop.asp



Giles Watson  Identity Verified
Italy
Local time: 15:44
Italian to English
Money talks May 17, 2012

Trevor Chichester wrote:

How do you guys deal with these files?



By quoting a hefty (at least 30%) premium for working with them.

In practice, though, I don't do any. The client either comes up with a viable file format or goes elsewhere. I know plenty of translators who are quite happy to deal with scanned images but I'm not one of them.



Nikita Kobrin  Identity Verified
Lithuania
Local time: 16:44
Member (2010)
English to Russian
* May 17, 2012

Trevor Chichester wrote:
How do you guys deal with these files?


1) I ask the client to convert the PDF file into an editable format (MS Word) and send it to me for translation. I accept only converted files that are 100% identical to the PDF files they were converted from.

2) If the client cannot produce a 100% identical conversion himself, I ask my DTP operator to do the conversion. To compensate him for his work, I charge the client extra. It's not cheap: in difficult cases the cost of conversion may be equal to the cost of translation.

Nikita Kobrin

[Edited at 2012-05-17 20:26 GMT]



Anton Konashenok  Identity Verified
Czech Republic
Local time: 15:44
English to Russian
Just OCR it, but do it properly May 17, 2012

Nikita, your DTP operator seems to be overcharging you by a huge factor. In my own experience, OCRing a scanned text of decent quality (maybe even a good fax) has never taken me more than 10% of the time needed for translation, and I consider it good customer relations to offer it free of charge if a steady client sends me an occasional scanned document.
There is, however, an important point to remember: never run your OCR in fully automatic mode, nor allow it to format the paragraphs for you. I'm using FineReader, defining the recognition areas by hand (selecting text or table as appropriate) and saving the results as plain text. For very clear originals, I may decide to save as formatted text instead, but delete all paragraph styles created by FineReader before doing any further work - this way, I only keep character-level formatting (font size and bold/italic/underline). Recreating the necessary paragraph format by hand takes a small fraction of the time needed to straighten out the automatically generated formatting.
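
To illustrate the plain-text route described above: OCR output saved as plain text usually keeps the hard line breaks of the scan, so lines have to be rejoined into paragraphs before the text goes into a CAT tool. The following is only a minimal sketch in Python, not part of FineReader or of the workflow above; the function name reflow_ocr_text is made up for this example, and it assumes that blank lines separate paragraphs and that a line ending in a hyphen followed by a lowercase letter is a word split across lines.

```python
import re


def reflow_ocr_text(raw: str) -> str:
    """Join hard-wrapped OCR lines into paragraphs.

    Assumptions (hypothetical, adjust to your own scans):
    - one or more blank lines separate paragraphs;
    - a line ending in "-" followed by a line starting with a
      lowercase letter is a single word hyphenated at the line break.
    """
    paragraphs = []
    for block in re.split(r"\n\s*\n", raw.strip()):
        text = ""
        for line in block.splitlines():
            line = line.strip()
            if not line:
                continue
            if text.endswith("-") and line[:1].islower():
                # Rejoin a word that was split across two lines.
                text = text[:-1] + line
            elif text:
                text += " " + line
            else:
                text = line
        paragraphs.append(text)
    return "\n\n".join(paragraphs)


if __name__ == "__main__":
    sample = "The trans-\nlation was\ndone.\n\nNext para-\ngraph."
    print(reflow_ocr_text(sample))
```

Note that this rule would also merge a genuine compound that happens to break at its hyphen (for example "German-" followed by "speaking"), so the result still needs a read-through against the scan.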



Nadezhda & Vatslav Yehurnovy  Identity Verified
Ukraine
Local time: 16:44
Member (2008)
English to Russian
Pricing often does NOT cover OCRing May 18, 2012

We also have a friend who sometimes helps with OCRing and deep DTP wizardry, but we completely agree with Nikita about charging extra per hour. And then the originals in Word or other editable, not pre-OCRed formats really do start to appear like magic.

Well, sometimes miracles do not happen, and then the client pays per hour for re-creating the document versions from a scanned all-tables PDF, with several consecutive rounds of changes to the numbers in the cells.

Anton, how about a scanned 15-page document with numerous barely legible handwritten memos with arrows etc., full of tables and block diagrams?

We just gave a quote for OCRing, drawing and typing, and received back a great Word file with everything intact, in just 3 hours.



Rolf Keller
Germany
Local time: 15:44
English to German
Online services vs. confidentiality May 18, 2012



Using such online services might compromise confidentiality.

