Help needed
Thread poster: Dijana Evans

Dijana Evans
United Kingdom
Local time: 07:41
Member (2009)
English to Croatian
+ ...
Oct 17, 2014

We are taking on a project for the translation of bank statements, which are in scanned PDF format.

It would be helpful if we could have them converted to DOC with the formatting matching the originals as closely as possible, but I'm a bit wary of advertising the job due to privacy concerns.

Can anyone provide any tips, or point me in the direction of a reliable DTP expert (by PM if necessary)?

Any help would be enormously appreciated.



Henry Hinds  Identity Verified
United States
Local time: 00:41
English to Spanish
+ ...
OCR Oct 17, 2014

You can use an OCR (Optical Character Recognition) program, but often you end up with a mess that is more trouble than it's worth. You must use a program intended for the source language. The alternative is to re-create the original format as closely as is practical, or just well enough for it to be understood. You should charge extra for working with difficult formats because they can be very time-consuming.


Dijana Evans
United Kingdom
Local time: 07:41
Member (2009)
English to Croatian
+ ...
TOPIC STARTER
OCR Oct 17, 2014

To be honest, I would rather steer clear of OCR. The scan quality on some of these documents isn't that crisp. This is a legal client and we cannot afford a single mistake. What I would ideally like is someone with a bit of time on their hands who is willing to be paid to recreate all the pages manually, copying and pasting the parts that are the same (headers, etc.).

Joakim Braun  Identity Verified
Sweden
Local time: 08:41
German to Swedish
+ ...
Acrobat Oct 17, 2014

The built-in OCR in Acrobat Pro is quite good.

If these statements are all structured in the same way, setting up a couple of Word stylesheets (and headers/footers for repeated text) won't be too laborious. You could even use InDesign, which gives you much more sophisticated control.

And yes, if you use OCR you need to manually proofread every word and number...
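
As a rough aid to that proofreading pass, and assuming the OCR result has been saved as plain text, a short Python sketch along these lines could list every numeric token with its line number so each one can be ticked off against the scan. The function name `list_numbers` and the pattern are illustrative assumptions, not part of Acrobat or any OCR tool, and this does not replace reading every figure by eye.

```python
import re

# Hypothetical pattern: a run of digits that may contain "." or "," inside it,
# e.g. 1,234.56 or 1.234,56 or 250. Adjust to the number format of the statements.
NUMBER_PATTERN = re.compile(r"\d[\d.,]*\d|\d")


def list_numbers(ocr_text):
    """Return (line_number, token) pairs for every numeric token in the OCR text."""
    found = []
    for line_number, line in enumerate(ocr_text.splitlines(), start=1):
        for match in NUMBER_PATTERN.finditer(line):
            found.append((line_number, match.group()))
    return found


if __name__ == "__main__":
    sample = "Opening balance 1,234.56\nCard payment 2.50"
    for line_number, token in list_numbers(sample):
        # Print a checklist line for the proofreader to compare with the scan.
        print(f"line {line_number}: {token}")
```

Dates and account numbers will show up in the list as well, which is probably what you want on a bank statement anyway.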



Rodrigo Castillo H.
Chile
Local time: 03:41
English to Spanish
A few tips Oct 17, 2014

As a DTP specialist, I can tell you that there's no reliable way to convert from PDF to Word, especially if the PDF contains a lot of tables or complex formatting (indents, bullets...). If you need to convert to Word, I'd just run OCR on the original PDF but export the result as plain text. It's usually easier and faster to reapply formatting to a whole plain text document than to fix badly formatted OCR files.
Another workflow would be to use OCR to generate a plain text file, translate that file, and then DTP the final translation (in Word, InDesign or whatever you prefer).
If you need any assistance, don't hesitate to contact me.

