How does a PDF file get translated? - workflows
Thread poster: José Henrique Lamensdorf

José Henrique Lamensdorf  Identity Verified
Brazil
Local time: 14:26
English to Portuguese
Sep 2, 2011

I had to explain it to a PM yesterday, so I'm taking the chance to share it with the Proz community. Maybe it will help people review/adjust their options.

The process envisioned here involves three people: PM, Translator, Reviewer. A DTP artist sometimes joins the cast.

It is assumed that the end-client sent a software-generated (aka "distilled") PDF, not a scanned one; though a scanned one would be covered by Process D below.

Process A:
1. PM gets a PDF, and uses e.g. SolidConverter to convert it into Word.
2. PM obtains a DOC file often crammed with text boxes, which drives most - if not all - CAT tools crazy.
3. Translator goes crazy trying to adjust layout with Word, burns the midnight oil to make PM happy.
4. Any change by Reviewer causes text reflow, and Reviewer burns the midnight oil to make PM happy.
5. The result doesn't look so neat, graphically, yet PM distills it into a PDF and delivers.

Process B:
1. PM gives Translator the PDF.
2. Translator uses Trados to work on it directly.
3. Editor can't change much there without causing layout havoc. Only way is to edit the TM and have Translator re-run Trados on the original.
4. PM gets a translated PDF, yet the layout is a real mess, and some overflowing text has vanished.
5. PM has no other option than delivering it as-is and facing the music.

Process C:
1. PM demands the original DTP file from the client, crossing fingers that they still have it. Sometimes they do.
2. PM sends that file to a DTP artist skilled in that specific DTP application (typically one of InDesign, PageMaker, FrameMaker, QuarkXPress, MS Publisher, Serif PagePlus, Scribus... or possibly a complex-layout Word file).
3. DTP artist extracts all text and builds a 2-col table in Word or Excel for the translator to work on.
4. Translator translates the left column into the right one.
5. DTP artist implements translations on the DTP application, one text block at a time. Distills into a PDF.
6. Editor reviews the PDF.
7. DTP artist implements corrections.
8. Steps 6-7 get repeated ad nauseam. (Last time I had this process it took 8 times - DTP artist didn't know squat about the target language).
9. DTP artist distills file to final PDF.
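
Steps 3-4 of Process C can be sketched roughly as follows, assuming the DTP artist has already pulled the text blocks out of the layout file as plain strings. The sample blocks are made up, a CSV that opens in Excel stands in for the two-column table, and the extra block ID column is my own addition so each translation can be matched back to its text block.

```python
import csv
import io

def build_two_column_table(blocks):
    """Return CSV text with one row per text block: ID, source, empty target."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["block_id", "source", "target"])
    for block_id, text in enumerate(blocks, start=1):
        # The target cell is left blank for the translator to fill in.
        writer.writerow([block_id, text, ""])
    return buffer.getvalue()

# Hypothetical text blocks extracted from a DTP file
blocks = ["Safety instructions", "Read this manual before use."]
table = build_two_column_table(blocks)
print(table)
```

Keeping one block per row is what lets the DTP artist paste translations back "one text block at a time" in step 5.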

Process D (I used it for 20+ years, playing both Translator & DTP roles - it is still my option for scanned PDFs):
1. Either PM or Translator does OCR on the PDF, getting the text.
2. Translator translates that text.
3. DTP artist extracts all graphic elements from the PDF.
4. DTP artist puts the PDF pages, one by one, as background in the respective pages, e.g. using PageMaker.
5. DTP artist formats all translated text, puts the pictures in place, re-creates all graphic elements, one by one (a labor-intensive process), rebuilding the entire pub, deletes the background pages used as a guide/template, and distills it into a PDF.
6. Editor goes over that PDF and adds sticky notes to all they want changed, and sends back to DTP artist.
7. DTP artist makes changes according to each sticky note, and distills a new PDF.
8. Steps 6-7 get repeated as needed.
9. DTP artist distills a final PDF, and PM delivers.

Process E: (the one I'm using now, the reason to explain these options to the PM - again, I'm playing both the Translator & DTP roles).
1. PM sends PDF file to DTP artist.
2. DTP artist uses InFix Pro to export all text from the PDF into either XML or tagged TXT (and tags PDF as well).
3. Translator translates the XML or TXT using any CAT tool they like.
4. Reviewer reviews the translation, and sends to DTP artist.
5. DTP artist uses InFix to import the translated XML or TXT file into the tagged PDF, solves any availability issues caused by partially embedded fonts, and fixes layout problems arising from reflowing ("swollen" or "shrunk") text, getting a translated PDF.
6. Reviewer does a final check on the PDF. May use either InFix or Acrobat (full - not the free Reader) to fix - hopefully minor - issues.
7. PM gets final file and delivers to end-client.
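
The export/translate/import round trip in steps 2-5 of Process E can be illustrated with a small sketch. This is NOT the file format InFix actually writes; the `<doc>`/`<seg>` element names, the ids and the sample text are all invented for illustration. The point is only that each text block carries a tag that survives translation, so the translated text can be matched back to its place in the PDF.

```python
import xml.etree.ElementTree as ET

def export_segments(segments):
    """Build an XML string with one <seg id="..."> element per text block."""
    root = ET.Element("doc")
    for seg_id, text in segments.items():
        seg = ET.SubElement(root, "seg", id=seg_id)
        seg.text = text
    return ET.tostring(root, encoding="unicode")

def import_segments(xml_text):
    """Read the (translated) XML back into an id -> text mapping."""
    root = ET.fromstring(xml_text)
    return {seg.get("id"): seg.text or "" for seg in root.iter("seg")}

# Hypothetical text blocks, keyed by made-up ids
source = {"p1-b1": "User Manual", "p1-b2": "Keep away from heat."}
exported = export_segments(source)

# Stand-in for the translator's work in a CAT tool:
# tags and ids stay untouched, only the text changes.
translated_xml = (
    exported
    .replace("User Manual", "Manual do Usuário")
    .replace("Keep away from heat.", "Mantenha longe do calor.")
)

translated = import_segments(translated_xml)
# Any id that went out but did not come back points to a damaged tag.
missing = sorted(set(source) - set(translated))
print(translated, missing)
```

A check like `missing` is worth running before import, since a tag deleted or mangled during translation would otherwise leave a block untranslated in the final PDF.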


These are the workflows I know to get PDF files translated. Colleagues are invited to add any other ones they use, as well as variants from these.


 

Janet Ross Snyder  Identity Verified
Canada
Local time: 14:26
Member (2006)
French to English
Word-only Sep 2, 2011

The process I use.
I receive the pdf to be translated.
I open a new document in Word.
I check my archives for similar page formats to use as templates and copy those pages into the new document, as appropriate.
I re-create the pdf document in Word, matching format as closely as possible, page for page and line for line, and translating from source language to target language.
I submit Word document to the agency.


 

Tom in London
United Kingdom
Local time: 18:26
Member (2008)
Italian to English
Not me Sep 2, 2011

I don't accept pdfs.

I'm a translator.

Converting pdfs into usable text is somebody else's job.

[Edited at 2011-09-02 12:25 GMT]


 

nikolinadp  Identity Verified
Local time: 19:26
Spanish to Croatian
thanks for sharing Sep 2, 2011

Interesting post, thanks José Henrique.

I'd have a question: Do you charge more when you play, as you say, both Translator & DTP roles (process D)?


 

José Henrique Lamensdorf  Identity Verified
Brazil
Local time: 14:26
English to Portuguese
TOPIC STARTER
DTP charges Sep 2, 2011

nikolinadp wrote:
Interesting post, thanks José Henrique.
I'd have a question: Do you charge more when you play, as you say, both Translator & DTP roles (process D)?


Of course, Nikolina! I'm playing two roles there. However, the total, viz. translation + DTP, is obviously cheaper than it would be if the client sourced each from a different vendor, as many steps can be carried out jointly.

Meanwhile, for process "E" I'm tentatively charging a flat fee per page on top of the translation cost. For the time being it's about half the cost of DTP in process "D", and a smaller fraction, between 1/3 and 1/4 of the average prices in the marketplace for the DTP in process "C".
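
To make those ratios concrete, here is a quick worked example. The fee of 5.0 per page is a made-up figure, not an actual rate; only the ratios (E is about half of D, and between 1/3 and 1/4 of C) come from the paragraph above.

```python
def dtp_costs_per_page(process_e_fee):
    """Per-page DTP costs implied by the stated ratios, given the Process E fee."""
    return {
        "E": process_e_fee,
        "D": process_e_fee * 2,       # E is about half of D
        "C_low": process_e_fee * 3,   # E is about 1/3 of C ...
        "C_high": process_e_fee * 4,  # ... down to 1/4 of C
    }

costs = dtp_costs_per_page(5.0)  # 5.0 per page is a hypothetical fee
for process, cost in costs.items():
    print(process, cost)
```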


 

David Wright  Identity Verified
Austria
Local time: 19:26
German to English
Why so complex Sep 2, 2011

I take the pdf, print it out and translate it (using dictation machine and Dragon).

Any formatting issues are simply an SEP (someone else's problem).

OCR is not so reliable that I want to use it, and frankly wouldn't save me any time.

Nor have I ever ever been expected to do anything else!


 

nikolinadp  Identity Verified
Local time: 19:26
Spanish to Croatian
I thought so Sep 2, 2011

José Henrique Lamensdorf wrote:

Of course, Nikolina! I'm playing two roles there. However, the total, viz. translation + DTP, is obviously cheaper than it would be if the client sourced each from a different vendor, as many steps can be carried out jointly.

Meanwhile, for process "E" I'm tentatively charging a flat fee per page on top of the translation cost. For the time being it's about half the cost of DTP in process "D", and a smaller fraction, between 1/3 and 1/4 of the average prices in the marketplace for the DTP in process "C".



Thank you for that, José!

The thing is, I have a client who always sends me scanned documents. I didn't mind re-creating the original document, and it wasn't that time-consuming either, but the last project was a manual with images, tables and colors, and it was impossible to convert. So when I said I would charge a bit more for the formatting, they said it was the first time anyone had told them that (but they did accept it). So I just wanted to know what my experienced colleagues do. :)


 

José Henrique Lamensdorf  Identity Verified
Brazil
Local time: 14:26
English to Portuguese
TOPIC STARTER
That's a big problem in the translation industry Sep 2, 2011

nikolinadp wrote:
The thing is, I have a client who always sends me scanned documents. I didn't mind re-creating the original document, and it wasn't that time-consuming either, but the last project was a manual with images, tables and colors, and it was impossible to convert. So when I said I would charge a bit more for the formatting, they said it was the first time anyone had told them that (but they did accept it). So I just wanted to know what my experienced colleagues do. :)


It involves client education.

Many people in business nowadays never saw translation being done with a Smith-Corona typewriter or a Parker fountain pen. So they ask 'how much do you charge per word/page/whatever?' and think it's the total cost of having some artsy publication translated and laid out exactly as the original. Many translators never saw a paste-up art studio, nor DTP software, so they kill themselves trying to rebuild such publications using MS Word, definitely the wrong tool to do accurate layout work, and charge nothing for this work that often takes more time and effort than the translation itself.

The missing lesson is that MS Word is a word processor, therefore a virtual typewriter packed with tons of sci-fi-like resources, but still a typewriter. Meanwhile a DTP application is the proper software for layout work. Incidentally, the all-time DTP pioneer PageMaker (and possibly its son, InDesign) replicates accurately a paste-up studio and all its formerly external vendors (such as typesetting) in virtual reality.

One specific lower-end DTP program, Serif PagePlus, really drew the line. I only saw its early versions, haven't seen the later ones in years. It had absolutely NO text formatting resources, only layout tools. All the text should come ready from a word processor. If any changes beyond a few letters were needed to the text, it should be edited on the word processor and transferred again.

So these are two separate jobs. If a taxi driver could do the grocery shopping and bring it home for you, they should certainly charge more than simply the cab fare on the meter, right?


 

Hege Jakobsen Lepri  Identity Verified
Local time: 13:26
Member (2002)
English to Norwegian
Used to be a problem... Sep 2, 2011

because my OCR software wasn't very good, but since I started using Wordfast Anywhere, I get converted files with all logos & signatures in rtf.

 

Ambrose Li  Identity Verified
Canada
Local time: 13:26
Chinese to English
Wordfast Anywhere and PDF files Sep 2, 2011

By chance, I tried Wordfast Anywhere on a short PDF file not long ago, and gave up immediately. The file had rotated text boxes (in a table, and it was still real text that I could copy and paste, not outlined and not rasterized) that really confused the OCR. So no, I don’t think OCR will be able to handle all PDF files.

 

Cedomir Pusica  Identity Verified
Serbia
Local time: 19:26
Member (2009)
English to Serbian
9 pages an hour Sep 4, 2011

Hi all,

I had a 72-page document in a nicely laid-out .pdf format some 10 days ago.

It took me 8 hours to properly prepare it for translation and get rid of all Word tags, text boxes, etc. This should be reflected in your price, provided you still want to play with it.

However, I told the client that I would not keep exactly the same formatting as the original and that there would be no pictures in the resulting document, which they accepted.

Another example: an extremely complex, formatting-rich 300-page file. I decided to ask the client who did the DTP for them, contacted the agency and asked them for an editable format: .inx in this case (InDesign). It worked perfectly well.

PDFs are the reality, and if you want to deal with them, you'd better be ready to charge for it. My experience shows that it takes about one hour to prepare 9 pages.
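
As a rough way to turn that rate into a quote, here is a small sketch. The 9 pages per hour comes from the 72-page job above; the hourly rate of 30.0 and the 90-page job are made-up figures, so plug in your own.

```python
PAGES_PER_HOUR = 9  # rate reported above: 72 pages took 8 hours

def prep_hours(pages, pages_per_hour=PAGES_PER_HOUR):
    """Estimated hours needed to prepare a PDF for translation."""
    return pages / pages_per_hour

def prep_fee(pages, hourly_rate):
    """Preparation fee to add on top of the translation price."""
    return prep_hours(pages) * hourly_rate

print(prep_hours(72))      # 8.0, matching the job described above
print(prep_fee(90, 30.0))  # hypothetical 90-page job at a hypothetical rate
```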


 

