The best way to convert scanned PDF to Microsoft Word
Thread poster: Karolina Petkuviene
Karolina Petkuviene
Lithuania
English to Lithuanian
Oct 19, 2012

I have ABBYY FineReader 9.0, but I am not satisfied. It takes too much time to format the text after converting. Could somebody suggest better software? Thank you in advance.

564354352
Denmark
Danish to English
Latest version of Adobe Acrobat Oct 20, 2012

Apparently, Adobe Acrobat XI, which has just been released, can convert PDF files directly into Word, Excel and PowerPoint.

I've seen their demo, and it looks excellent. Expensive, but what a time-saver.

[Edited at 2012-10-20 06:05 GMT]



Tony M
France
Member
French to English
Previous discussions Oct 20, 2012

I think you'll find that this subject has been discussed at some length in previous threads, and I feel you may find some helpful comments there.

I can't say I have very wide experience of different software, but it is my highly personal observation that ABBYY does do an extremely good job at the character recognition part.

However, as far as document formatting is concerned, it suffers from the same problem that is surely inevitable with any such program: there is no way it can intelligently determine what the original document formatting was.

Generally, and depending on the exact format of the original document, I find the best solution is to choose the conversion option that attempts the least formatting possible; it is then relatively easy to re-apply the original document formatting manually at the end. That depends, of course, on your customer's requirements: very often, my customers are happy to receive plain text, which can then be readily re-formatted by a skilled word-processing operator.
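
For anyone who exports plain text and wants to tidy it before re-applying formatting, here is a minimal Python sketch of one possible clean-up. It is not a feature of any OCR program. It assumes that blank lines separate paragraphs and that a line ending in a hyphen is a word split at the line end; the function name `to_plain_paragraphs` is made up for illustration.

```python
import re

def to_plain_paragraphs(ocr_text: str) -> str:
    """Collapse hard-wrapped OCR output into one line per paragraph."""
    paragraphs = []
    # Assumption: blank lines separate paragraphs in the exported text.
    for block in re.split(r"\n\s*\n", ocr_text.strip()):
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        text = ""
        for line in lines:
            if text.endswith("-"):
                # Assumed end-of-line hyphenation: glue the two word halves.
                text = text[:-1] + line
            elif text:
                text += " " + line
            else:
                text = line
        # Collapse runs of spaces/tabs left over from column alignment.
        paragraphs.append(re.sub(r"[ \t]+", " ", text))
    return "\n\n".join(paragraphs)

sample = "The quick brown fox jum-\nped over the   lazy\ndog.\n\nSecond paragraph\nhere."
print(to_plain_paragraphs(sample))
```

Note that a genuine hyphenated compound that happens to fall at a line end would be glued together as well, so the result still needs a quick read-through.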

[Edited at 2012-10-20 12:02 GMT]



V S Rawat
India
English to Hindi
Gitte! It was about image-to-text conversion, not text-to-text formatting. Oct 20, 2012

Gitte Hovedskov Hansen wrote:

Apparently, Adobe Acrobat XI, which has just been released, can convert PDF files directly into Word, Excel and PowerPoint.

I've seen their demo, and it looks excellent. Expensive, but what a time-saver.

[Edited at 2012-10-20 06:05 GMT]


Hi Gitte Hovedskov Hansen,

Adobe converts only text PDF files to other formats. It doesn't do OCR to turn text in images into text files, and I think that is what the original poster is asking about. ABBYY is OCR software.

In any case, if you open a text PDF file in any PDF reader, you can just select the entire text and copy it to Word or any text editor, provided the PDF file is not protected, so that is not an issue.

Thanks.
--
Rawat



neilmac
Spain
Spanish to English
Two methods Oct 20, 2012

I use SolidConverter (mine is version 4) for converting PDFs into Word and vice versa. I find it very good and easy to use for most things, but it doesn't work with some scanned texts. You can download a free demo version for evaluation here:
http://www.soliddocuments.com/pdf/-to-word-converter/304/1

I've also just acquired a more powerful OCR program (OmniPage), which my colleague says is very good for converting to and from PDF, even scanned ones, but I haven't had the chance to try it out yet.



Agnes Lenkey
German to Spanish
Previous thread Oct 20, 2012

Hi mairapt,

Here is one of the previous threads about this issue:

http://www.proz.com/forum/software_applications/232135-converting_pdf_files_to_word_advice_needed.html

And here is another one:

http://www.proz.com/forum/software_applications/234410-rotating_a_pdf.html

Best regards,

Agnes



Siegfried Armbruster
Germany
Member (2004)
English to German
I am happy with FineReader 8 Oct 20, 2012

mairapt wrote:
I have ABBYY FineReader 9.0, but I am not satisfied. It takes too much time to format the text after converting.


Are you using the software optimally, i.e. not using autorecognition for the layout but telling the software which parts are text, which are tables, which are images, etc.?

FineReader 8 gives very good results when you tell the software what to do and how to handle the text and formatting. Afterwards there is relatively little work to do in Word, just general adjustment of margins and font/character spacing.

I have converted very large documents this way, and the result has always been OK. However, if you tell your client that you will charge for the time you need to convert the scanned PDF, you will find that very often they find the original in a format you can process directly.



[Edited at 2012-10-20 09:53 GMT]


Karolina Petkuviene
Lithuania
English to Lithuanian
TOPIC STARTER
Hello Siegfried Oct 20, 2012

I am using autorecognition, and then I have to recheck all the text and make many changes. Now I have over 100 pages.


Heinrich Pesch
Finland
Member (2003)
Finnish to German
Consider typing Oct 20, 2012

If conversion gives you bad results and editing the file takes too much time, you could just type the translation into Word while reading from the PDF. For long files, using a typist could be cost-effective: s/he would type the source text and you could translate it using your favorite tool.


Tom in London
United Kingdom
Member (2008)
Italian to English
Not the translator's job Oct 20, 2012

Siegfried Armbruster wrote:

.....if you tell your client that you will charge for the time you need to convert the scanned PDF, you will find that very often they find the original in a format you can process directly.


Precisely. The translator's job is to translate. Not anything else.

If you're given a PDF to translate, just make sure you also deliver your finished translation as a PDF.

That'll teach 'em!

[Edited at 2012-10-20 09:32 GMT]



Alexander Chisholm
Italian to English
ABBYY FineReader is the best I've tried so far Oct 20, 2012

and it will handle PDF files prepared from scanned files (the vast majority of the work I am given).

I use Build 8. onwards (Mac) and it is fairly fast and very reliable.
The only problem may be formatting with very large documents.
The formatting the program outputs is pretty good, but in my experience, if the text runs over a page, the formatting will be screwed up when you translate sentences crossing a page.
It's very often worthwhile doing some pre-translation "clean-up" to avoid this problem and basically give yourself an uncluttered body of text to translate, and if this is too time-consuming then that's the price you pay. I still think it worthwhile because I can then still use Trados etc., which means I have access to TMs for speed and consistency.
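
As a rough illustration of that kind of clean-up, here is a small Python sketch that merges page-by-page text so a sentence cut by a page break becomes whole again. It is only a sketch under assumptions: the text has been exported as plain text with a form feed between pages, page numbers sit alone on their own line, and a page that does not end in sentence-final punctuation continues on the next page. The name `join_pages` is hypothetical and not part of any tool mentioned in this thread.

```python
import re

# A page "ends a sentence" if its last characters are closing punctuation,
# optionally followed by quotes or brackets (a heuristic, not a rule).
SENTENCE_END = re.compile(r"[.!?:;][\"')\]]*$")

def join_pages(pages: list[str]) -> str:
    """Merge per-page text so a sentence cut by a page break becomes whole."""
    merged = ""
    for page in pages:
        # Drop lines holding nothing but a page number (assumed footer style).
        lines = [ln for ln in page.strip().splitlines()
                 if not re.fullmatch(r"\s*\d+\s*", ln)]
        text = "\n".join(lines).strip()
        if not text:
            continue
        if not merged:
            merged = text
        elif SENTENCE_END.search(merged):
            # Previous page ended a sentence: start a new paragraph.
            merged += "\n\n" + text
        else:
            # Sentence continues across the break: join with a space.
            merged += " " + text
    return merged

# Assumption: the export uses a form feed (\f) as the page separator.
exported = "First sentence. The second one runs\n1\f over the page break.\n2\fNew page text."
print(join_pages(exported.split("\f")))
```

A paragraph that ends exactly at the bottom of a page and one that merely pauses there look the same to this heuristic, so the output is a starting point for manual checking rather than a finished file.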

Tom, I understand your point, but if the PO says:
Hand-off - PDF file
Delivery - Word file

then you accept that when you accept the job - basic and pragmatic fact of life.

