working on PDFs
Thread poster: oliver098
oliver098
Local time: 20:40
English to French
Oct 31, 2006

If someone sends you some work in a PDF, what's the best way to work on it? Is it necessary to buy Adobe Acrobat, or is it OK just to have the reader (then, maybe paste it into Word to work on the translation? - but then would they expect it back in a PDF format too?)

 
Natalie  Identity Verified
Poland
Local time: 21:40
Member (2002)
English to Russian
+ ...

MODERATOR
SITE LOCALIZER
Hi Oliver Oct 31, 2006

Welcome to ProZ.com!

The problem of working with PDF files has been discussed in the forums many times; please use the forum search (in the upper right corner of forum pages). You will certainly find much useful information.

You may also want to search the Article Knowledgebase; for example, take a look at this article:
http://www.proz.com/translation-articles/articles/128/

Regards,
Natalie
P.S. FYI, I have moved this thread to the Office Applications forum.


 
Viktoria Gimbe  Identity Verified
Canada
Local time: 15:40
English to French
+ ...
Easy procedure for working with PDFs Oct 31, 2006

Here's what I do:

I open the PDF and export it to text. This may not work with PDFs that contain graphics or have funky layouts. When you export as text (you can do this with Reader, no need to buy Pro), you get a Notepad document with all of the text in the PDF. However, when exporting from PDF to any other format, you will get lots of carriage returns in your document that break your sentences into several pieces - these will have to be removed.

Download a free tool called AutoUnbreak - look it up on Google, you'll find it fast - and use it. First, you'll need to copy and paste the text document you just created into Word, OpenOffice or similar and save it as RichText (.rtf). Then, you load this file in AutoUnbreak and process it. What comes out in the end is a RichText file with no formatting but with all unnecessary carriage returns removed, so you can now translate the file using any CAT tool - the sentences will be complete, not broken up.
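
For anyone who prefers a script to a separate tool, here is a minimal Python sketch of what "removing unnecessary carriage returns" can mean. It is not AutoUnbreak's actual method, which is not described in this thread. It assumes a break is unnecessary when the line before it does not end in sentence-final punctuation; the name `unbreak` and the punctuation list are illustrative choices only.

```python
# Characters assumed (for this sketch) to mark a legitimate end of line.
SENTENCE_END = (".", "!", "?", ":", ";")

def unbreak(text: str) -> str:
    """Join lines that were split mid-sentence; keep blank lines as paragraph breaks."""
    out = []       # finished lines
    current = ""   # line currently being assembled
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # Blank line: close the running line and keep the gap.
            if current:
                out.append(current)
                current = ""
            out.append("")
            continue
        if current and not current.endswith(SENTENCE_END):
            # Previous line stopped mid-sentence: glue this line onto it.
            current += " " + line
        else:
            if current:
                out.append(current)
            current = line
    if current:
        out.append(current)
    return "\n".join(out)
```

Headings and list items without final punctuation will be glued to the line that follows them, so the result still needs checking against the PDF.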

You can also copy parts of the PDF or the whole PDF and paste directly into a Word or OpenOffice file and then save as RichText. The added benefit here is that the formatting is preserved - bold text in the PDF will stay bold in the Word doc, bullets in the PDF will also be preserved, indents, etc. Then, you process this RichText doc with AutoUnbreak and out comes a doc with unnecessary carriage returns removed but with all formatting preserved.

Now, all you have to do is translate this document - go ahead, use a CAT tool, it will work fine. Just make sure you have the original PDF open while you translate, so you can track the formatting and the rest while you work. This is important, because if for some reason, when you exported to text/copied to Word, some parts of the PDF were not copied and pasted, you will find out about this before it's too late.

Usually, when asked to translate a PDF, you don't have to send the finished translation in PDF. Your contract is about translating text, not desktop publishing. However, it's always good to let the client know beforehand in the quote that you will deliver Word files - that way, they know you will not be fiddling with Acrobat and that it will be up to them to make the final PDF version if required. Besides, if their document is a PDF, this probably means that they have Acrobat Professional (they probably used it to create the translatable document, right?) and thus they will be able to create a PDF from your translation.

Finally, I recommend you ask for a higher rate when working on PDFs. You will spend a little bit of time preparing and checking your document, and you should be compensated for that time. Also, next time, you can ask the client to give you a Word document instead of a PDF - this will save them money too, and it will save you a PDF headache.

All the best!

[Edited at 2006-10-31 19:07]


 
Oliver Walter  Identity Verified
United Kingdom
Local time: 20:40
German to English
+ ...
You can make PDFs without Acrobat Oct 31, 2006

In addition to what Natalie and Viktoria wrote, you might like to know this: The open-source office suite OpenOffice.org (a) is free of charge, and (b) can export documents to PDF. What I usually do if I want to make a PDF is:

(1) Create and edit the document with Microsoft Word (because I'm familiar with it). Save the document and exit from Word.
(2) Open the document with OpenOffice Writer (that's the word processor) and use File > Export as PDF.

You can download OpenOffice.org from its web site; it's about 70 MB. It's also sometimes included on the CD or DVD that comes with PC magazines (such as Personal Computer World, PC Pro, Computer Shopper, PC Plus, PC Advisor).
AutoUnbreak's website is here:
http://digital.hollmen.dk/products/autounbreak/index.htm
Oliver

[Edited at 2006-10-31 20:22]


 
Giles Watson  Identity Verified
Italy
Local time: 21:40
Italian to English
In memoriam
Don't work on PDFs Oct 31, 2006

Viktoria Gimbe wrote:

Download a free tool called AutoUnbreak - look it up on Google, you'll find it fast - and use it.



Just a little caveat about AutoUnbreak. I downloaded it a while ago and then decided to delete the program because my antivirus (Panda) had detected spyware in the package.

I get rid of line breaks using the routine recommended by Yves Champollion in the Wordfast manual:
>
Actually, a one-pass FR can achieve just the same result, but don't tell anyone, because it's a secret:

Find what ([!^0013])([^0013])([!^0013])
Replace with \1 \3
Use Wildcards

(Note the space after \1) Amazing, right? Be cautious though – on some Ms-Word versions, ^0013 introduces a new line but not necessarily a paragraph, as surprising as this may seem… Use this geeky method if you’re a geek yourself and know what you’re doing.
>

I have saved this routine as a macro in Word and added it to my toolbar.
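
Outside Word, the same one-pass replacement can be approximated with Python's `re` module. This is a sketch, not the Wordfast routine itself: it assumes the text uses `\n` or `\r\n` line endings, and it uses lookarounds instead of the three capture groups so that the characters on either side of a break are not consumed by the match. `join_single_breaks` is a name made up for the example.

```python
import re

# A line break with a non-break character on each side, i.e. a lone break
# inside a paragraph rather than part of an empty line.
SINGLE_BREAK = re.compile(r"(?<=[^\r\n])\r?\n(?=[^\r\n])")

def join_single_breaks(text: str) -> str:
    """Replace each isolated line break with one space; leave doubled breaks alone."""
    return SINGLE_BREAK.sub(" ", text)

if __name__ == "__main__":
    print(join_single_breaks("one\ntwo\n\nthree\nfour"))
```

Doubled breaks (empty lines between paragraphs) are left untouched, so paragraph boundaries survive as long as the export separated them with an empty line.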



Viktoria Gimbe wrote:

Finally, I recommend you ask for a higher rate when working on PDFs.



Good advice from Viktoria.

I always quote a premium of at least 20-30% for handling PDFs, pointing out that I will apply my standard rate if the customer can supply the text in a CAT-friendly format.

This works wonders. As it happens, I haven't charged anyone a PDF premium for years.
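
To put numbers on such a premium, here is a small Python sketch. The word count and per-word rate are invented purely for illustration, and `quote` is a hypothetical helper, not anything from a real invoicing tool.

```python
def quote(words: int, rate_per_word: float, pdf_premium: float = 0.0) -> float:
    """Return the job price, with an optional surcharge for PDF source files.

    pdf_premium is a fraction: 0.25 means 25% on top of the standard rate.
    """
    return round(words * rate_per_word * (1 + pdf_premium), 2)

if __name__ == "__main__":
    # Hypothetical job: 5,000 words at 0.10 per word.
    print("CAT-friendly source:", quote(5000, 0.10))
    print("PDF source, 25% premium:", quote(5000, 0.10, pdf_premium=0.25))
```

Showing both figures side by side in the quote makes the saving from supplying an editable file obvious to the customer.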

Cheers,

Giles


 
Viktoria Gimbe  Identity Verified
Canada
Local time: 15:40
English to French
+ ...
Missing detail Oct 31, 2006

Oliver, you added the part I forgot to add.

I do exactly the same thing when I need to make PDFs - although none of my clients have ever requested this. I make PDFs to proofread myself - Acrobat can read a text out loud, so it's an excellent way to check if my style sounds right.

I also use OOo to make PDFs - is there any simpler way to go about this? I think not.

Thanks for mentioning it, Oliver! My post was already long enough as it was...


 
esperantisto  Identity Verified
Local time: 22:40
Member (2006)
English to Russian
+ ...
SITE LOCALIZER
Some other tools Oct 31, 2006

First, it's important to keep in mind that the PDF format was designed to be un-editable, so you basically can't work with such files directly (there is Foxit PDF Editor, however, but I've heard no really positive opinions about this program and thus haven't tried it out). Fortunately, there are some workarounds; unfortunately, none of them is perfect:
a) Extract the text only from a PDF with programs such as pdf2text or similar: search the Web. Disadvantages: you can't make the translation look like the original, and some PDFs don't produce any real output (scanned PDFs, of course, and in many cases PDFs with non-ASCII characters).
b) Convert the PDF to MS Word or RTF format. The two best programs of this kind are ScanSoft PDF Converter and SolidConverterPDF. Adobe Acrobat can also do this, but the output is absolutely lousy, don't even try. Disadvantages: huge output files, sometimes really schizophrenic formatting, no output for scanned PDFs, problems with custom fonts, and images, especially vector ones, may become a real nightmare.
c) Feed the PDF to an OCR program such as ABBYY FineReader. Disadvantages: recognizing and checking the text takes a lot of time, and all images turn into bitmaps. But you can also OCR scanned PDFs!

As for creating PDFs, downloading OpenOffice.org only for this is, of course, overkill (and I really can't understand why you would then edit files in MS Word if you have OOo and can do everything in that brilliant program; MS Office is just garbage compared to OOo). I'd advise PDFCreator: small, simple, free. http://sf.net/projects/pdfcreator


 
Vito Smolej
Germany
Local time: 21:40
Member (2004)
English to Slovenian
+ ...
SITE LOCALIZER
AND, above all - Oct 31, 2006

don't forget to charge the client for all this cr*ppy travail with converters, OCR readers, etc. You do it once for free, you're doomed to do it again for free.

Don't. Just charge your actual hours.

smo


 
oliver098
Local time: 20:40
English to French
TOPIC STARTER
sounds complicated.. Nov 2, 2006

Thanks for the suggestions. A bit more complicated than I expected, but hopefully I'll find one of these methods that works for me if and when I need to do this kind of work.

 

