Extracting text from PDF files.
Thread poster: Alexander Chisholm

Alexander Chisholm  Identity Verified
Local time: 18:55
Italian to English
Nov 19, 2002

Does anyone know of a utility for extracting text from .pdf files? The output can even be unformatted, as long as the actual text is there.


Evert DELOOF-SYS  Identity Verified
Belgium
Local time: 18:55
Member
English to Dutch
www.pdfcount.com Nov 19, 2002

is a very handy tool for this.

Or try Adobe Acrobat 5.0.

Good luck!



Valentina Pecchiar  Identity Verified
Italy
Member
English to Italian
Wordfast (www.champollion.net) Nov 19, 2002

Hi

That's the tool you need!

I find it terrific, much better at reconstructing lines and paragraphs than Acrobat itself, especially if you select OptimisticPDF in Pandora's Box (the options panel).

There is a downside, though: it's a CAT tool that performs lots of other functions as well, so it can be a bit confusing the first time you see it.

The free trial version will do the job just as smoothly as the commercial version.

Have fun (and Wordfast)




Judy Rojas  Identity Verified
Chile
Local time: 12:55
Spanish to English
Try Omnipage 12 Nov 19, 2002

That is the tool I use. It can be a bit expensive, but if you translate a lot of PDF files it's worth every penny.



Omnipage keeps the original fonts and layout of the document.



HIH



Frank Bremster  Identity Verified
Local time: 19:55
German
Finereader 6.0 Nov 19, 2002

Does the same as Omnipage: it keeps everything as it is. It is also very expensive (somewhere around $700), but you can download a 30-day trial, which works perfectly.

Nothing other than Omnipage and Finereader will give you 100% of the formatting and pictures. I've tried them all!



See www.abbyy.com



Arnaud HERVE  Identity Verified
France
Local time: 18:55
English to French
Acrobat Nov 20, 2002

Buy Acrobat.

That way you get a lot of other PDF functions as well.



ashi
United States
Local time: 09:55
English to Hebrew
Adobe Illustrator 10 Nov 20, 2002

From Adobe.com


Alexander Chisholm  Identity Verified
Local time: 18:55
Italian to English
TOPIC STARTER
Thanks for the suggestions. Nov 20, 2002

I had never thought of trying to get an OCR program to extract the text; I may give that a try. I have a version of Finereader (it came with my scanner), but I fear it may be a lite version that lacks that option. Mind you, I hadn't thought of trying it until now.

Otherwise, I have another full-version OCR program I can try. I'll also keep trying the other options mentioned. If all else fails, I may bite the bullet and buy Acrobat. I don't have all that many .pdf files to translate, but the numbers are increasing slightly. I have always used a freeware program to create .pdf files, so I had never seriously considered Acrobat in the past.

There is always the slight problem, though, of the security settings that may have been set on the document. Or is that something you can get around?



monitor  Identity Verified
Local time: 18:55
English to German
Go professional! Nov 20, 2002

Try the products from iceni.com: Gemini Solo and Gemini Version 4.

You can download a trial version, with some restrictions.

Compared to other tools, it seems to me the most reliable I have come across, and at $159 the price is more than fair.

Also bear in mind that somebody, somewhere, converted an original file into that PDF document, so it is worth asking for the Word file or other formatted source first.

Kind Regards

Marcel

