suitable OCR
Thread poster: telefpro

telefpro
Local time: 00:28
Portuguese to English
Sep 18, 2008

I have a document that was typewritten in 1957 and has since been converted to PDF. The printouts cannot be read. Is there any way to make this document more legible using some OCR software? It is in French. Please advise.

 

Martin Skara, PhD.  Identity Verified
Slovakia
Local time: 20:58
French to Slovak
Only ABBYY FineReader Sep 18, 2008

is the best solution for PDF-to-DOC/RTF conversions.

https://abbyy.asknet.com/cgi-bin/dlreg/ml=EN?ID=FRP9DEMOM


Good luck
Martin


 
xxxmediamatrix
Local time: 14:58
Spanish to English
The world's top OCR system ... Sep 18, 2008

telefpro wrote:

I have a document that was typewritten in 1957 and has since been converted to PDF. The printouts cannot be read. Is there any way to make this document more legible using some OCR software? It is in French. Please advise.


... is the human eye.

If you can't read the document, then no OCR software will be able to either.

MediaMatrix


 
esperantisto  Identity Verified
Local time: 21:58
Member (2006)
English to Russian
Precisely! Sep 18, 2008

mediamatrix wrote:
If you can't read the document, then no OCR software will be able to either.


Absolutely right!

However, in theory, one might convert the PDF into a set of image files such as TIFF and try playing with gamma/color correction. But that's alchemy, not an exact science.
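
For anyone curious what "playing with gamma" amounts to, here is a minimal sketch in plain Python. It assumes the page has already been exported as 8-bit grayscale pixel values; the function names and the sample numbers are made up for illustration, and in practice an image editor or imaging library would do this step. With the convention used here (output = 255 * (input / 255) ^ gamma), a gamma above 1 darkens the image and widens the gap between faded ink and light paper.

```python
def gamma_lut(gamma, levels=256):
    """Build a lookup table that maps each gray level through a gamma curve."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    top = levels - 1
    return [round(top * (v / top) ** gamma) for v in range(levels)]


def apply_gamma(pixels, gamma):
    """Apply the gamma curve to a grayscale image given as rows of 0-255 values."""
    lut = gamma_lut(gamma)
    return [[lut[p] for p in row] for row in pixels]


# Hypothetical patch of a faded scan: ink (about 180) is only slightly
# darker than the paper (about 230).
faded = [[230, 180, 228], [182, 231, 179]]
darker = apply_gamma(faded, 3.0)

# The gap between paper and ink is wider after correction.
print(faded[0][0] - faded[0][1], darker[0][0] - darker[0][1])
```

Whether this helps depends entirely on the scan, so the value of gamma is something to experiment with rather than a recommended setting.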


 

Jack Doughty  Identity Verified
United Kingdom
Local time: 19:58
Member (2000)
Russian to English
Zoom it? Sep 18, 2008

With Adobe Acrobat, even if you only have Adobe Acrobat Reader, you can zoom in on the page to show the details a lot larger. If this doesn't help, I have no idea what else would.

 

Anna Sylvia Villegas Carvallo
Mexico
Local time: 13:58
English to Spanish
This should do the trick Sep 18, 2008

Though it's in Spanish, you'll certainly be able to understand if you have Office 2003. Click on the link below.

You have your own OCR


 

ViktoriaG  Identity Verified
Canada
Local time: 14:58
English to French
I beg to differ Sep 18, 2008

mediamatrix wrote:

If you can't read the document, then no OCR software will be able to either.


I have successfully displayed on screen text that didn't even seem to be there. My OCR software is OmniPage, and it has a built-in function to enhance images before starting the recognition on it (if you know what you are doing, you can also do this with graphic programs like Photoshop).

There are documents that cannot be read by the human eye that can be made readable with the help of software. If I was in telefpro's situation, I would try it.
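
As a rough illustration of the kind of enhancement meant here, the sketch below stretches the contrast of a faint grayscale patch and then thresholds it to pure black and white. This is not OmniPage's or Photoshop's actual method, which is not described in this thread; it is a generic technique written in plain Python, and the function names, the threshold of 128 and the sample values are assumptions chosen for the example.

```python
def stretch_contrast(pixels):
    """Linearly rescale gray levels so the darkest becomes 0 and the lightest 255."""
    flat = [p for row in pixels for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A perfectly flat image has no contrast to stretch.
        return [row[:] for row in pixels]
    scale = 255 / (hi - lo)
    return [[round((p - lo) * scale) for p in row] for row in pixels]


def binarize(pixels, threshold=128):
    """Turn every pixel below the threshold black (0) and the rest white (255)."""
    return [[0 if p < threshold else 255 for p in row] for row in pixels]


# Hypothetical faint patch: ink (about 200) is almost as light as paper (about 215).
faint = [[200, 215, 201], [214, 199, 216]]
clean = binarize(stretch_contrast(faint))
print(clean)  # [[0, 255, 0], [255, 0, 255]]
```

A difference of about 15 gray levels is hard to see on a printout, but after stretching it spans nearly the full range, which is why text can appear where none seemed to be.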

[Edited at 2008-09-19 15:22]


 
xxxmediamatrix
Local time: 14:58
Spanish to English
QA? Sep 18, 2008

Viktoria Gimbe wrote:

There are documents that cannot be read by the human eye that can be made readable with the help of software. If I was in telefpro's situation, I would try it.


And how do you propose that telefpro should go about validating the OCR output? Even with texts that are easily human-readable, no OCR software is ever 100% accurate. If telefpro can't read the source text, is it reasonable to assume that the output from image-enhanced OCR will, by some miracle, be 100% reliable on this particular occasion?

Telefpro is up against a fundamental law of entropy here.
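
One partial check, offered only as a sketch: run the OCR twice (for example with two programs or two enhancement settings) and compare the results word by word. The snippet below uses Python's standard difflib module; the function name and the two sample strings are invented for illustration. Agreement between the passes does not prove the text is right, but every disagreement marks a spot that needs a closer look against the original.

```python
import difflib


def disagreements(pass_a, pass_b):
    """Return (text_a, text_b) pairs for the stretches where two OCR passes differ."""
    a_words, b_words = pass_a.split(), pass_b.split()
    matcher = difflib.SequenceMatcher(None, a_words, b_words, autojunk=False)
    return [
        (" ".join(a_words[i1:i2]), " ".join(b_words[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Hypothetical output of two OCR passes over the same line.
first = "Le contrat a été signé le 12 mars 1957"
second = "Le contrat a été signé le 17 mars 1957"
print(disagreements(first, second))  # [('12', '17')]
```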

MediaMatrix


 

ViktoriaG  Identity Verified
Canada
Local time: 14:58
English to French
That's the part his/her intelligence is needed for Sep 18, 2008

mediamatrix wrote:

And how do you propose that telefpro should go about validating the OCR output?


Well, s/he can read, right? Once s/he gets the OCR output, s/he can read through it to decide whether it is good enough to be processed. Isn't that what we should be doing even with texts that are already in an editable format?

The point of telefpro's question is to make the text readable by the human eye, not to turn it into editable text (although s/he may ultimately be interested in that as well). What other method do you propose? I don't know of any other solution. It's a matter of using technology to enhance what is humanly feasible.

In some cases, like the present one, technology can go much farther than the human brain - although this is usually not the case.


 

