Can you recommend a PDF converter and/or OCR? Thread poster: Pristine
Pristine Local time: 12:00 English to German
I need it mainly for English and German documents. The program should convert PDF documents into Word or WordPad files. The OCR should 1) read German and 2) read English. And they should not cost an arm and a leg. Any links to freeware and shareware would be nice, but I looked and have not found anything yet. Thanks in advance! Kindly, Pristine
Try using the search function in the forums | Oct 16, 2008 |
There is a lot of information available for those who take a few seconds to look. For a QA checklist for documents converted by OCR, take a look on the "How To" tab of my profile. There is a link there titled "Post-processing of OCR text files". | | |
Roberto Bertuol United Kingdom Local time: 19:00 Member (2007) Italian to English + ... PDF to Word converter | Oct 16, 2008 |
Hi, here is a link to the Able2Doc Professional converter: http://www.investintech.com/order_a2d_pro.htm It costs $69.96 and has an OCR reader, though I'm not sure about German. For me it works fine, although it really depends on the quality of the PDF document, i.e. whether it is a scanned document or a Word document converted into PDF. Hope it helps
Rimma Kehr Germany Local time: 20:00 German to English + ... PDF to any Text Tool (Word, FrameMaker, PageMaker, InDesign, etc.) | Oct 16, 2008 |
If you have Adobe Illustrator, you can open your PDF file, then save every page in *.ai or *.eps format. Then you can copy the text and paste it into your text tool. If you have Adobe Reader 8.0 or 9.0, you can copy the text and paste it into any text tool. Hope this helps. Rimma
Tomas Forro Poland Local time: 20:00 English to Slovak + ... pdf to word never works well | Oct 16, 2008 |
Hi, I've tried all kinds of PDF to Word converters, both cheap and expensive ones, and they never work very well. With plain text, usually only a few formatting corrections are needed (but this usually works even without a converter, with simple drag & drop). However, the more complicated the original formatting, the worse the result of the conversion. The very worst are pictures and text over graphical elements. What I would suggest is to find some freeware converters, or ones that come with a free trial period, uninstall them when they expire, then get another one, and so on. I think there is really no point in investing in any "professional" converter at this stage. Or is any of you using a pdf2word converter that you can say is really, really good? For freeware converters, simply google words like "pdf to doc download free" or something similar and you'll get plenty of them. As for OCR, a couple of years ago I used ABBYY FineReader and it was an absolutely excellent tool for all the languages. I would usually get about 97% accuracy in the converted text, even from poor-quality copies and books. The link for ABBYY: http://finereader.abbyy.com/
Tomas Forro Poland Local time: 20:00 English to Slovak + ... Actually, the latest ABBYY OCR converter has built-in pdf2word, too | Oct 16, 2008 |
It's 150 euros, but these guys really know what they're doing (in OCR technology; for PDF, well, who knows?)
Price shouldn't be an issue | Oct 16, 2008 |
I use Nuance OmniPage Pro. It reads German, among many other languages. I know, it is the most expensive of all such software, but considering the return on investment and the amount of time it helps you save (which, of course, depends on usage), it is worth every penny. When looking for a tool, I don't think it is so important to watch the price. If you consider investing money in something that will help boost your productivity, that already means that the tool in question will put money in your pocket. So a difference of a couple of hundred dollars isn't an issue, in my opinion.
Another approach | Oct 16, 2008 |
If you need it more to preserve formatting while editing than for use with CAT tools, have a look at this one: http://www.iceni.com/infix.htm It allows you to actually edit PDF files.
Allesklar Australia Local time: 03:30 English to German + ... PDFConverter 4 | Oct 17, 2008 |
I use PDF Converter 4 for English and German texts, as well as Infix when it's not a scanned document and preserving the formatting is important, as José mentioned. PDF Converter is not ideal for poor-quality scans, but works well enough for most of what I get. I tried the demo versions of ABBYY and OmniPage one or two years ago and wasn't that impressed, so I went for the budget tool. Maybe I should have another look at them.
achisholm United Kingdom Local time: 19:00 Italian to English + ... Not all PDF files are the same | Oct 17, 2008 |
Many PDF "converters" just capture the text in a PDF and make it available for other uses, rather like using the text selection tool "|" in Acrobat and copying to the clipboard, only a bit more sophisticated. Unfortunately, many of the PDF files I work with are graphics files, i.e. PDFs produced from a scanned image. This is the typical method used by EU offices and regulatory bodies to store the documents submitted to them. Hence, these files don't contain any text to be converted. The only way to deal with such files is to OCR the images. This is why I prefer to use OCR software for this type of task. I currently use OmniPage 16, although I have used FineReader in the past, and both give good results. Recognition accuracy is high, the capture language can be selected, and spelling can be checked in the desired language. Hope this helps.
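The text-versus-scan distinction described above can be checked roughly before choosing a tool. What follows is only a sketch under stated assumptions, not how any of the products in this thread work: it assumes that a PDF with real text declares font resources (`/Font`) while a pure scan contains image objects (`/Subtype /Image`) and no fonts. The function names `looks_scanned` and `looks_scanned_file` are made up for illustration, and PDFs that store their dictionaries in compressed object streams can fool the check.

```python
import re

# Byte patterns that appear in uncompressed PDF object dictionaries.
FONT_RE = re.compile(rb"/Font")
IMAGE_RE = re.compile(rb"/Subtype\s*/Image")


def looks_scanned(pdf_bytes: bytes) -> bool:
    """Guess whether a PDF is image-only (True) or carries real text (False).

    Crude heuristic, not a parser: text needs font resources, whereas a
    plain scan usually has image objects and no fonts at all.
    """
    has_fonts = FONT_RE.search(pdf_bytes) is not None
    has_images = IMAGE_RE.search(pdf_bytes) is not None
    return has_images and not has_fonts


def looks_scanned_file(path: str) -> bool:
    # Hypothetical convenience wrapper: read the file and apply the heuristic.
    with open(path, "rb") as handle:
        return looks_scanned(handle.read())
```

If `looks_scanned_file("document.pdf")` returns True, a plain text-extracting converter will probably find nothing and OCR is the likely route. A scan that has already been run through OCR and carries a hidden text layer will report False, since it does contain text.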
Pristine Local time: 12:00 English to German TOPIC STARTER PDF converter and OCR | Oct 17, 2008 |
Thanks to all of you for your kind responses and advice. Best regards, Pristine | | |
Oleksandr Ivanov Ukraine Local time: 21:00 Member (2008) English to Ukrainian + ... A nice tool from ABBYY (PDF Transformer 2.0) | Oct 17, 2008 |
Alexander Chisholm wrote: "Unfortunately, many of the PDF files I work with are graphics files, i.e. PDFs produced from a scanned image. These are the typical methods used by the EU offices and regulatory bodies to store the documents submitted to them. Hence, these files don't contain any text to be converted. The only way to deal with such files is to OCR the images. This is why I prefer to use OCR software for this type of task."
It is exactly for this reason that I use PDF Transformer 2.0 from ABBYY. It lets you process PDF files either as text or as scanned images, and converts the output into RTF or XLS format (although it gives the RTF file a DOC extension, which I find a bit misleading). It also lets you choose the areas to convert (three different area types: text, table or image). It is relatively cheap (I bought mine for USD 30 two years ago). It does not put paragraph marks or line breaks at the line ends within paragraphs. You can also choose from a number of languages for the output file (almost all EU languages, Russian, Ukrainian, Turkish and Kurdish).
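For converters that do leave a hard break at the end of every line, a small clean-up pass on the exported plain text can rejoin the lines before the file goes into a CAT tool. This is a minimal sketch, assuming paragraphs are separated by at least one blank line; the function name `unwrap_paragraphs` is made up for illustration and has nothing to do with PDF Transformer itself.

```python
import re


def unwrap_paragraphs(text: str) -> str:
    """Join hard-wrapped lines; blank lines are kept as paragraph breaks."""
    # Normalise Windows line endings first.
    text = text.replace("\r\n", "\n")
    # Split on blank lines (which may contain stray spaces or tabs).
    paragraphs = re.split(r"\n\s*\n", text.strip())
    # Inside a paragraph, collapse every run of whitespace into one space.
    # Hyphens at line ends are left alone on purpose: in German they are
    # often part of a real compound, so removing them blindly is risky.
    joined = [" ".join(paragraph.split()) for paragraph in paragraphs]
    return "\n\n".join(joined)
```

Because each paragraph ends up on one line, sentence segmentation in the CAT tool is no longer broken in the middle of a sentence.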
Nuance PDF Converter | Oct 18, 2008 |
IMHO, good PDF converters do not exist. They are more or less able to satisfy your needs, but nothing more. And if you are a translator, you should take into account that CAT tools that are not based on MS Word will read all the rogue codes left in your converted documents. Segmentation and propagation will suffer too because of these rogue codes. My first choice would be Nuance PDF Converter: http://www.nuance.com/pdfconverter/ The main reason is that it has built-in OCR that can read text in graphical formats such as BMP, which other PDF converters don't. It also preserves formatting quite well when the layout is not too complex. Pricing: https://www.nuancestore.com/dr/sat5/ec_MAIN.Entry11?SP=10034&PN=0&xid=19198&trackingid=view-quickbuy&CUR=840 You can download a free 30-day trial version to check it: http://www.nuance.com/pdfconverter/trial/spectrum/ Bear in mind that the trial is not a production tool: some pages won't be converted or will carry a watermark. My second choice would be SolidPDFConverter: http://www.soliddocuments.com/products.htm?product=SolidConverterPDF It has no built-in OCR, but its real page conversion preserves the formatting impressively well. The problem: MS Word does not read text in text boxes. But this issue can easily be overcome with a werecat macro: http://www.volny.cz/ddaduc/werecat.html
Price and quality | Oct 18, 2008 |
Viktoria Gimbe wrote: "I use Nuance OmniPage Pro. I know, it is the most expensive of all such software - but considering the return on investment and the amount of time it helps you to save (which, of course, depends on usage), it is worth every penny."
I agree that price is not the main issue here, but results are. I have used Nuance OmniPage Pro and PDF Converter Pro for several years, and I am not that impressed with their results. I have, however, heard a lot of praise for ABBYY FineReader. I have also noticed that there are great differences between the different versions of all the products discussed here. Does anybody know of an independent test of the relevant products? There should be something out there; personally, I come across PDF files several times a month.
Tom in London United Kingdom Local time: 19:00 Member (2008) Italian to English If you're going to spend money | Oct 18, 2008 |
... why not go the whole hog and purchase Adobe Acrobat Professional? It's a good investment if you foresee that you'll be working on lots of pdf files. It offers numerous useful functions including OCR, save to Word etc. | | |