Off topic: If it doesn't say "OCR" then it aint OCR Thread poster: Samuel Murray
| Samuel Murray Netherlands Local time: 18:31 Member (2006) English to Afrikaans + ...
To anyone who might want to buy a PDF converter: If the word "OCR" is not on the web site, then the converter can't do OCR. A somewhat computer-illiterate colleague of mine just bought Nuance PDF Converter for $50 and could not understand why she was unable to edit the converted MS Word files. Well, the reason is that the converted Word file is full of little images, with no text. Nuance's PDF Converter faithfully "converted" the PDF, but not in any way that is useful. My personal opinion (which is often not shared by software developers or their marketing departments) is that if a user may reasonably have a certain expectation about a product, which is not available in that particular version, it should be clearly stated on the web site or in the product description. For PDF converters, I think "we don't do OCR" should be mandatory. | | | Angela Dickson (X) United Kingdom Local time: 17:31 French to English + ... on the other hand... | May 8, 2009 |
Samuel Murray wrote: To anyone who might want to buy a PDF converter: If the word "OCR" is not on the web site, then the converter can't do OCR. A somewhat computer-illiterate colleague of mine just bought Nuance PDF Converter for $50 and could not understand why she was unable to edit the converted MS Word files. Well, the reason is that the converted Word file is full of little images, with no text. Nuance's PDF Converter faithfully "converted" the PDF, but not in any way that is useful. My personal opinion (which is often not shared by software developers or their marketing departments) is that if a user may reasonably have a certain expectation about a product, which is not available in that particular version, it should be clearly stated on the web site or in the product description. For PDF converters, I think "we don't do OCR" should be mandatory. I just checked the Abbyy website, and their PDF Transformer product (I'm still using version 1) is not described as OCR, but it does produce a perfectly usable Word document from a text or image PDF, provided the image is reasonably clear (in this respect it does not differ from any other OCR product). The trick is to select the option 'Retain font and font size' instead of 'Retain full page layout'; the latter option can result in the annoying box problem. I'm not familiar with the Nuance product, though. Edited to add link to Abbyy: http://www.pdftransformer.com/
[Edited at 2009-05-08 09:53 GMT] | | | Susan Welsh United States Local time: 12:31 Russian to English + ... ABBYY PDF Transformer/OCR | May 8, 2009 |
Angela Dickson wrote: I just checked the Abbyy website, and their PDF Transformer product--I'm still using version 1--is not described as OCR, but it does produce a perfectly usable Word document from a text or image PDF, provided the image is reasonably clear (in this respect it does not differ from any other OCR product). The trick is to select the option 'Retain font and font size' instead of 'Retain full page layout' - the latter option can result in the annoying box problem. Edited to add link to Abbyy: http://www.pdftransformer.com/ ?? I have the ABBYY PDF Transformer, and it does say it has a limited OCR capability (compared to ABBYY Finereader, which is mainly intended for OCR, I believe). Perhaps it's not in the marketing material but in the user's manual. But it definitely does say that.
[Edited at 2009-05-08 10:22 GMT] | | |
Well, before buying any software, you really ought to try it first and see whether it suits your needs. For example, Abbyy FineReader is so far the only solution that is useful for me, and anything else I have seen so far is not good enough, OCR or no. Of course, FR is not perfect, but the best for now. | |
Angela Dickson (X) United Kingdom Local time: 17:31 French to English + ...
Yes, Susan, I just looked at the FAQ and OCR is mentioned. I was just suggesting that if you applied Samuel's metric to the Abbyy product, you might rule it out as a solution, when in fact it does the job pretty well. Looking at the description of the Nuance product, I'm surprised it produced the result Samuel says it did, so have downloaded a trial version to see how it copes with the PDF I'm currently working on. Watch this space... | | | Julia_O_K Netherlands Local time: 18:31 English to Russian + ... SolidPdfConverter | May 8, 2009 |
SolidPdfConverter is quite good for documents with lots of graphics and tables. Sometimes it is even better than FineReader, if the text is clearly legible! | | | Chamz Germany Local time: 18:31 Romanian to German + ...
Angela Dickson wrote: ... The trick is to select the option 'Retain font and font size' instead of 'Retain full page layout' - the latter option can result in the annoying box problem. Thank you, Angela, for sharing the trick... it is very helpful. I am also very happy with my FineReader from Abbyy (this is not an advertisement). None of the other programs I've used met my expectations. Greetings, and have a nice weekend! Magda | | | no trial version available? | May 8, 2009 |
I'm not familiar with the Nuance product, but I'm hesitant to pay money for any piece of software that I haven't been able to try out on a trial version just to make sure it does what I need it to do, up to the standards that I need. Specifically in the OCR realm, I've heard many good things about Finereader, but the price tag is a little steep. I'd been using Able2Extract Pro for about a year ($129) but frankly am not satisfied with the quality/resource usage. I've been trialing ReadIris and think it's much superior in many many ways for the same price. | |
Samuel Murray wrote: For PDF converters, I think "we don't do OCR" should be mandatory. I personally think that a person who wants to do OCR should know what the term means. They also should know the difference between text files and image files. It's as easy as trying to select a portion of the PDF document to see if you can select anything using the text selection tool. If one doesn't know how to use Acrobat Reader, then maybe they should read up on the subject before spending money on things they don't understand. Another thing about the term OCR is that if you want OCR, you should look for the term in the product description. If the term isn't there, then you know the product isn't the right one for you. To me, forcing software developers to clearly state that their product doesn't do OCR is the same as forcing a pet food manufacturer to label their packaging with "not intended for human consumption"... | | |
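The "try to select some text" check described above can also be approximated in a script. What follows is only a rough sketch, not how any of the products mentioned in this thread work: it assumes the PDF is unencrypted and that its streams are either uncompressed or Flate-compressed, and the function name `looks_like_text_pdf` is made up for illustration. It looks for a font-selection operator (`Tf`), which a page needs before it can show any real text; a scanned, image-only PDF normally has none.

```python
import re
import zlib

# Body of each stream object: the raw bytes between "stream" and "endstream".
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\n?endstream", re.DOTALL)
# Font-selection operator, e.g. "/F1 12 Tf"; text cannot be shown without one.
FONT_OP_RE = re.compile(rb"/[^\s/\[\]<>()]+\s+\d*\.?\d+\s+Tf\b")


def looks_like_text_pdf(data: bytes) -> bool:
    """Heuristic (hypothetical helper): True if any stream selects a font."""
    for match in STREAM_RE.finditer(data):
        body = match.group(1)
        try:
            # Most content streams are Flate (zlib) compressed.
            body = zlib.decompressobj().decompress(body)
        except zlib.error:
            pass  # not Flate data: scan the raw bytes instead
        if FONT_OP_RE.search(body):
            return True
    return False
```

Calling it on the bytes of a file (read in binary mode) and getting `False` suggests the PDF is just page images, so a converter without OCR will produce images rather than editable text. A `True` result only says that some text layer exists; a scanned page with a small text stamp on it would also pass, so treat this as a first hint and not as proof.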
ViktoriaG wrote: I personally think that a person who wants to do OCR should know what the term means. They also should know the difference between text files and image files. Agree 100%. | | | Samuel Murray Netherlands Local time: 18:31 Member (2006) English to Afrikaans + ... TOPIC STARTER Disagree somewhat | May 21, 2009 |
ViktoriaG wrote: I personally think that a person who wants to do OCR should know what the term means. Yes, but someone who wants to convert a file format should not need to know the technical terms for the specific type of conversion. Few people know that "converting graphics to text" is called "OCR". Also, few people realise that PDF often means graphics. To them, their problem is called "I want to convert PDF to Word". You can't expect such people to know instinctively that there is an acronym "OCR", what it stands for, and that they should be looking for it. | | | They mention "OCR"... | May 21, 2009 |
... only when there is "manual mode"... Otherwise they offer it as a "black box". No need to say it is OCR. It is some "tool converting input to output". | |
Samuel Murray wrote: To them, their problem is called "I want to convert PDF to Word". That is precisely what I was hinting at. People who work with PDFs should know just what a PDF really is. It is not the term OCR that is misused, underused, overused, etc. It is the term PDF that is not well understood. Besides, OCR is used on an infinite variety of file formats, not just on PDFs. I sometimes wonder why some people think of PDFs as soon as they hear OCR... Knowing what constitutes a PDF document should normally prompt a person to read the available documentation searching for a clue on what kind of PDF the software can convert. I agree that this isn't always clear, so what I do when the information is ambiguous is contact the software editor and pop the question. This always works and I've never bought any software that didn't do exactly what I wanted it to. As most of us know, assuming serves only one goal: to make an ASS out of U and ME. | | |