Converting pdf files
Thread poster: Tracy Byrne
Tracy Byrne
Spain
German to English
Oct 26, 2005

Can DejaVu X Standard version 7 convert pdf files? I don't think so but would like to check before investing in a program that does!

 
Victor Dewsbery
Germany
German to English
Not a job for a CAT tool Oct 26, 2005

Converting PDF is a preliminary job, and there are many tools for it.

My favourites: SolidConverter and ABBYY FineReader 8.0. (In fact, FineReader 8.0 is more flexible, and it can even be used for a PDF with the text scanned from hard copy.)

You can do a rough-and-ready job by copying and pasting the text via the clipboard (e.g. into Word), but the resulting format is rotten.

There are other good reasons for using DejaVuX, but it does not convert PDF. And I would suggest using DVX Professional, not Standard (Standard is the Pro version with its hands tied behind its back).

[Edited at 2005-10-26 17:29]


 
Tracy Byrne
Spain
German to English
TOPIC STARTER
PDF converters Oct 26, 2005

Victor Dewsbery wrote:

Converting PDF is a preliminary job, and there are many tools for it.

My favourites: SolidConverter and Abby FineReader 8.0. (In fact, FineReader 8.0 is more flexible, and it can even be used for a PDF with the text scanned from hard copy).

- Thanks for the tips - I will check them out.

You can do a rough and ready job by copy pasting the text via the clipboard (e.g. into Word), but the resulting format is rotten.

- That's what I used to do but recently I have found that most pdf files I am getting no longer allow me to do it. I have Acrobat Reader 5.1 - perhaps I should download a later version, or could the files be "locked" anyway? I get a lot of contracts from various translation agencies in pdf format - presumably they receive them on paper (law firms don't seem to like computers!) and scan them to email them on to me.

There are other good reasons for using DejaVuX, but it does not convert PDF. And I would suggest using DVX Professional, not Standard (Standard is the Pro version with its hands tied behind its back).



- I've just bought Standard and have yet to learn how to use it, so I think I'll stick with that until I feel confident enough for the "all bells and whistles" version!


 
Victor Dewsbery
Germany
German to English
1. Dealing with PDF, 2. Sources for DVX Oct 27, 2005


Victor Dewsbery wrote:
You can do a rough and ready job by copy pasting the text via the clipboard (e.g. into Word), but the resulting format is rotten.


TracyB wrote:
- That's what I used to do but recently I have found that most pdf files I am getting no longer allow me to do it. I have Acrobat Reader 5.1 - perhaps I should download a later version, or could the files be "locked" anyway? I get a lot of contracts from various translation agencies in pdf format - presumably they receive them on paper (law firms don't seem to like computers!) and scan them to email them on to me.


Your inability to copy and paste has nothing to do with the Acrobat Reader version. There are 2 possible reasons:
1. The PDF may be scanned from hard copy (as you suggest)
2. The text may be copy protected (check in Acrobat Reader under File>Document Properties>Security).

If it is scanned from hard copy you will need an OCR program (FineReader and OmniPage are the market leaders here).
If it is copy protected the best solution is to ask the client for the password. Code cracking utilities will not always work, and I suspect they may be on the shady side of the law anyway.
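
The two cases above can be told apart roughly by script. This is a minimal sketch using only the Python standard library; the function name classify_pdf and the marker checks are illustrative assumptions, not how Acrobat Reader decides. It looks for an /Encrypt entry (security settings are present, which may or may not forbid copying) and for /Font resources (some real text is likely). PDFs that keep their objects in compressed streams can hide these markers, so treat the result as a hint only.

```python
def classify_pdf(data: bytes) -> str:
    """Return a rough guess about why text cannot be copied from a PDF.

    Hypothetical helper: a byte-level heuristic, not a real PDF parser.
    """
    if not data.startswith(b"%PDF-"):
        return "not a PDF"
    if b"/Encrypt" in data:
        # An encryption dictionary is present, so security settings may
        # restrict copying; check the document's security properties.
        return "encrypted"
    if b"/Font" not in data:
        # No font resources found: the pages are probably scanned images,
        # so an OCR program is needed.
        return "probably scanned"
    return "probably contains text"
```

For a file on disk, read it in binary mode and pass the bytes in, e.g. classify_pdf(open("contract.pdf", "rb").read()), where contract.pdf is a placeholder name.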

As for DVX, I suggest you join the Yahoo group [email protected], where you can ask your newbie questions and get prompt help from lots of users (by contrast with ProZ - hardly anyone has even looked at your question here). Another good source is the collection of tips on DV posted by Nelson Laterman at http://www.necco.ca/dv/index.htm


 
Kirill Semenov
Ukraine
Member (2004)
English to Russian
ABBYY PDF Transformer Wizard Oct 27, 2005

It's a stand-alone product from ABBYY which is designed to convert PDFs into Word documents. FineReader is also good, but the Transformer Wizard is more compact and specialized.

As for DVX or any other CAT tool, we have to understand that the PDF format is basically an image, a picture, not a text format, so no text-processing software can work with PDFs directly.


 
Victor Dewsbery
Germany
German to English
ABBYY product comparison Oct 27, 2005

Kirill Semenov wrote:
It's a stand-alone product from ABBYY which is designed to convert PDFs into Word documents. Fine Reader is also good, but the Transformer Wizard is more compact and specialized.
As for DVX or any other CAT tool, we have to understand that PDF format is, basically, an image, a picture, not a text format, so none of text processing software is able to work with PDFs directly.


Kirill, does the ABBYY Transformer Wizard convert scanned text into editable Word format? I thought it was just for PDFs created from computer files with editable content.

Even ABBYY don't seem able to make up their mind on this.
In the comparison table at http://www.abbyy.com/finereader8/?param=45377 they say that it doesn't process scans/images.
In the advertising blurb at http://www.abbyy.com/pdftransformer/ they claim that it does.
In any case, it doesn't claim to be able to OCR text on hard copy.

What is your experience with the program?
I must say that my experience with FineReader 8.0 so far is very good (including the formatting of stuff like tables).


 
Kirill Semenov
Ukraine
Member (2004)
English to Russian
Unprotected PDFs only Oct 27, 2005

Dear Victor,

As the name of the software implies, the product is for PDFs only. I've been using it for almost a year, and it's very good, but for other OCR tasks one has to use FineReader, of course.
Also, it cannot distill the text from a protected PDF, so you still have to know the password to convert it.

Basically, PDF Transformer Wizard is a 'light' version of FineReader, designed for PDFs only. Anyway, I believe ABBYY's products are among the best in the OCR field and a very good choice for a translator who has to deal with PDFs or hard copies like faxes, printed texts, etc.


 
Kevin Lossner
Portugal
German to English
Converting PDF for use with DVX May 28, 2006

I use FineReader v7 for doing this; it will handle most types of PDF document, even protected ones. I'm told that version 8 respects the Adobe protection features, but I haven't confirmed this by personal experience (is this really the case, Victor?).

After you OCR a PDF document with FineReader or other tools, there are a number of cleanup steps necessary in many cases to avoid "code salad". This is especially true if you save with any sort of text formatting. If you take a look on the Personal tab of my profile page, you'll find a link to a description of how to clean up OCR text (or other converted PDF text) to make it easier to work with in DV or similar tools.
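
As a rough illustration of this kind of cleanup, here is a minimal Python sketch for plain text exported from an OCR program. The steps are assumptions about typical OCR debris (soft hyphens, words split across line breaks, hard line breaks inside paragraphs, runs of spaces); they are not the procedure described at the linked page, and the function name clean_ocr_text is made up for the example.

```python
import re


def clean_ocr_text(text: str) -> str:
    """Normalise plain text exported from an OCR program (hypothetical helper)."""
    # Drop soft hyphens, which OCR exports sometimes leave in.
    text = text.replace("\u00ad", "")
    # Rejoin words hyphenated across a line break: "trans-\nlation" -> "translation".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Turn single line breaks inside a paragraph into spaces; keep blank lines.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Collapse runs of spaces and tabs.
    text = re.sub(r"[ \t]+", " ", text)
    # Trim stray spaces around the remaining line breaks.
    return re.sub(r" *\n *", "\n", text).strip()
```

Note that the hyphen rule will also merge genuine compounds that happen to break at the end of a line, so the result still needs a read-through before it goes into DV or a similar tool.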


 

