Exporting Tables from PDF files
Thread poster: bharg

bharg  Identity Verified
India
Local time: 22:39
French to English
Nov 1, 2003

Hi all,

One of my clients has given me a bilingual glossary of about 400 pages as a PDF file. The terms are arranged in a table in 2 distinct columns. I was wondering if I could convert this into a 2-column Excel worksheet which I could then import into Multiterm. I tried saving the PDF file as RTF but it doesn't retain the table format and all terms are just listed one after the other. The PDF is editable text and not a scanned image so I guess there must be some way to extract the table. All help will be appreciated.



Natalie  Identity Verified
Poland
Local time: 18:09
Member (2002)
English to Russian

MODERATOR
Try using good OCR software Nov 1, 2003

For example, FineReader Pro version 6 or higher. If your file is large, first divide it into smaller parts using the full version of Acrobat; otherwise opening the file in FineReader will take ages.

After opening the file, recognize the text as usual and then choose "Send to Word". 99% of the formatting will be preserved.



Harry Bornemann  Identity Verified
Mexico
English to German
Write a macro Nov 1, 2003

I would write a macro in Word-VBA or Perl.
First you could insert a marker such as # after every second end-of-paragraph mark, and then search and replace until you get a tab-separated table.

400 pages might be too much for FineReader and even too much for Word. That's where Perl becomes interesting: it would do the job within a few seconds.
HTH,
Harry

[Edited at 2003-11-01 12:04]
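
The pairing idea above can be sketched in a few lines of script. This is only a rough illustration in Python rather than Word-VBA or Perl, and it assumes the exported text alternates strictly between a source term and its translation, one per line, which may not hold for every PDF export. The names pair_lines and write_tsv and the sample terms are made up for the example.

```python
import csv
import io


def pair_lines(lines):
    """Pair alternating source/target lines into 2-column rows.

    Assumption: after dropping blank lines, line 1 is a source term,
    line 2 its translation, line 3 the next source term, and so on.
    """
    terms = [ln.strip() for ln in lines if ln.strip()]
    if len(terms) % 2:
        # An odd count means the alternation assumption is broken somewhere.
        raise ValueError("odd number of non-empty lines; check the extracted text")
    return list(zip(terms[0::2], terms[1::2]))


def write_tsv(rows, out):
    """Write the rows as tab-separated text, which Excel can open as 2 columns."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)


# Hypothetical sample standing in for the text exported from the PDF.
sample = ["maison", "house", "", "chien", "dog"]
rows = pair_lines(sample)

buf = io.StringIO()
write_tsv(rows, buf)
print(buf.getvalue(), end="")
```

For a real glossary you would read the lines from the exported text file and write to a .txt or .tsv file instead of the in-memory buffer. If the export lists all of column 1 before column 2 on each page, the pairing step would need to be changed accordingly.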



Mónica Machado
United Kingdom
Local time: 17:09
English to Portuguese
FineReader 7 could be useful Nov 1, 2003

Hello,

FineReader 7 could be useful. You can download a 15-day trial version (search for ABBYY). If 400 pages is too much for it, split the document in two. FineReader 7 works OK with 270 pages (I have never tried more than that in a single document).

Hope this helps

Regards,
Mónica

