Translate in memoQ starting from a PDF
Thread poster: Valeria Ricciardi

Valeria Ricciardi  Identity Verified
Italy
Local time: 10:40
Member (2005)
English to Italian
+ ...
Jan 15

Hi everyone.

I was wondering if there is a way to use PDFs in memoQ 8.2, or is it mandatory to first convert the document to Word format to be able to use it with this CAT?

Thank you for your precious support

Valeria


 

Thomas T. Frost  Identity Verified
Member (2014)
Danish to English
+ ...
Yes Jan 15

You can import PDFs directly into memoQ, but as far as I remember, the target file will be written in Word format, unless they have changed that in version 8, which I don't use.

 

Valeria Ricciardi  Identity Verified
Italy
Local time: 10:40
Member (2005)
English to Italian
+ ...
TOPIC STARTER
Great! Jan 15

Thank you so much Thomas! I see memoQ 8.1 and above have an integrated tool which I didn't notice before - as a matter of fact I have never received PDFs before!

Thanks again and regards

Valeria


 

Maija Cirule  Identity Verified
Latvia
Local time: 11:40
Member (2014)
German to English
+ ...
It depends on the PDF Jan 16

Valeria Ricciardi wrote:

Hi everyone.

I was wondering if there is a way to use PDFs in memoQ 8.2, or is it mandatory to first convert the document to Word format to be able to use it with this CAT?

Thank you for your precious support

Valeria

If the file is scanned, the only way to translate it in MQ 2015 is to transfer it to Word, and I am not sure whether it is any different with MQ 8.


 

José Henrique Lamensdorf  Identity Verified
Brazil
Local time: 06:40
English to Portuguese
+ ...
How a PDF is born Jan 16

Perhaps this info has been lost in the past...

Many years ago, Adobe invented PostScript, which was a printing language (and a printer standard as well). Any file, no matter how complex, could be "printed" to a *.ps file. If that file were sent to a PostScript printer (either a domestic laser or an industrial phototypesetter), the resulting printout would be exactly the same, limited to the physical resolution of that printer.

Then Adobe invented the PDF. It was the result of "distilling" (as they named it) such *.ps files into PDF (Portable Document Format) files that, using the Acrobat Reader program devised for the specific system at hand (Windows, DOS, MacOS, Linux, etc.), would open looking exactly the same.

So a distilled PDF has all its contents organized within itself in a standard manner. To translate it, it's a matter of accessing the text therein, and replacing it with the corresponding translation.

This is easier said than done, as text will often swell or shrink in translation. Furthermore, it makes no difference if a title is centered on the page, or left-aligned to a left margin that will place it centered on the page... until it changes length during translation, and is no longer centered.
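The centering problem can be put into numbers. This is only a rough sketch: the page width and both title widths are made-up figures, and the function names are invented for the illustration. They don't come from memoQ or any PDF tool.

```python
def centered_x(page_width: float, text_width: float) -> float:
    """Left edge that centers text of the given width on the page."""
    return (page_width - text_width) / 2


def center_offset(page_width: float, left_x: float, text_width: float) -> float:
    """Distance of the text's midpoint from the page's midpoint (0 = centered)."""
    return (left_x + text_width / 2) - page_width / 2


PAGE_WIDTH = 595.0    # A4 width in points (assumed for the example)
SOURCE_WIDTH = 200.0  # width of the original title (made-up figure)
TARGET_WIDTH = 260.0  # width of the longer translated title (made-up figure)

# A left-aligned title whose left edge merely happened to center the source text.
fixed_x = centered_x(PAGE_WIDTH, SOURCE_WIDTH)

# Keep that left edge and swap in the longer translation: the title drifts right.
drift_fixed = center_offset(PAGE_WIDTH, fixed_x, TARGET_WIDTH)

# A truly centered title recomputes its left edge and stays in the middle.
drift_recentered = center_offset(
    PAGE_WIDTH, centered_x(PAGE_WIDTH, TARGET_WIDTH), TARGET_WIDTH
)

print(fixed_x, drift_fixed, drift_recentered)
```

With these figures the left-aligned title ends up off-center by half of the extra text width, while the recentered one does not move.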


A short detour should be made here.

Microsoft Word - as its name implies - is a word processor. Its original paradigm is the typewriter. Evidence of that is that you must always start on the first page (no matter what its number is), and go on adding text, like filling a sausage. If you add a paragraph, say, on page 22, all ensuing pages will reflow.

DTP, short for DeskTop Publishing, is a different game. Its original paradigm is the paste-up/art studio, and its pioneer software, PageMaker (InDesign's "father"), follows it closely. Other DTP programs have their own paradigms, unparalleled in real life. A DTP program is able to handle very complex layouts accurately. You can work, say, on page 22, then on 60, then on 4 and, with the exception of text blocks spreading over multiple pages, nothing will move from where it is.


Back to our case...

Several of the CAT tools, memoQ among them, can trespass into PDF files, so you can translate the text within them. However, the layout of the translation will be cockeyed, as the text length changes in each block.

So one solution was found by developing converters from PDF into DOC/X. This would enable translators to fix the layout after translation using MS Word. Actually, most found it easier to translate on the converted DOC/X file.

If it's a plain-text book, this will be an easy way out. However, if the layout is something more complex, with charts, tables, callouts, labels, etc., assembled with a DTP app, Word is not an adequate tool to deal with it.

The recent trend is towards PDF editors, programs that will enable translators (and other folks too!) to adjust the layout on PDF files, so that their translation can again fit neatly into the allotted space.

One such PDF editor is Infix, from http://www.iceni.com. Its basic m.o. is to export all text from a PDF into tagged TXT, XML, or XLIFF format for translation, while tagging the PDF too; the text then gets translated outside the PDF using the CAT tool of your choice; and finally the translation is imported back into the PDF, every chunk of text in the right place, with the right font, etc. Well, not quite finally: being a PDF editor, Infix also has the DTP tools the translator needs to make all the necessary layout adjustments and get a pristine translated PDF.
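To give an idea of what the "translate outside the PDF" step amounts to, here is a minimal sketch. It assumes a generic XLIFF 1.2 file, not Infix's actual export, whose exact tagging may differ. The sample content, the `fill_targets` helper, and the dictionary standing in for a CAT tool's translation memory are all hypothetical.

```python
import xml.etree.ElementTree as ET

NS = "urn:oasis:names:tc:xliff:document:1.2"
ET.register_namespace("", NS)  # keep the default namespace when writing back

# Hypothetical export: a generic XLIFF 1.2 file with two segments.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="sample.pdf" source-language="en" target-language="it" datatype="plaintext">
    <body>
      <trans-unit id="1"><source>User manual</source></trans-unit>
      <trans-unit id="2"><source>Safety instructions</source></trans-unit>
    </body>
  </file>
</xliff>"""


def fill_targets(xliff_text: str, translations: dict) -> str:
    """Add a <target> to every trans-unit, keeping the unit ids untouched."""
    root = ET.fromstring(xliff_text)
    for unit in root.iter(f"{{{NS}}}trans-unit"):
        source = unit.find(f"{{{NS}}}source")
        target = unit.find(f"{{{NS}}}target")
        if target is None:
            target = ET.SubElement(unit, f"{{{NS}}}target")
        # Segments with no known translation fall back to the source text.
        target.text = translations.get(source.text, source.text)
    return ET.tostring(root, encoding="unicode")


# Stand-in for the translation memory of whatever CAT tool is used.
translations = {
    "User manual": "Manuale utente",
    "Safety instructions": "Istruzioni di sicurezza",
}
result = fill_targets(SAMPLE, translations)
print(result)
```

The point is that the unit ids survive the round trip, which is what lets the editor put every chunk of translated text back in its original place.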

There are other PDF editors. You may use memoQ (or another CAT tool) to trespass into the PDF, translate the text, and later use one such editor to fix the resulting layout manually.


Of course, this applies only to "distilled" (aka "live" or "editable") PDFs. If you get a scanned, or "dead", PDF, you'll have to do OCR on it. Again, if it's plain, streaming text, MS Word can handle it. If the layout is complex, it will have to be rebuilt with a DTP app.

Again, Infix has some tools (including OCR) to do it. However, I'm so fast with PageMaker after having used it for so many years that I have never used them.
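A rough way to tell a "live" PDF from a "dead" one before accepting a job is to look at how much text can be extracted from each page. The sketch below assumes the per-page text has already been pulled out with whatever extraction tool you have at hand (that step is not shown). The `classify_pdf` name and the 20-character threshold are arbitrary choices for the illustration.

```python
def classify_pdf(page_texts: list[str], min_chars: int = 20) -> str:
    """Classify a PDF from the text extracted from each of its pages.

    page_texts: one string per page, as returned by some extraction tool.
    min_chars:  pages with fewer non-blank characters count as having no text layer.
    """
    if not page_texts:
        return "unknown"
    pages_with_text = sum(1 for text in page_texts if len(text.strip()) >= min_chars)
    if pages_with_text == len(page_texts):
        return "live"      # every page has a text layer
    if pages_with_text == 0:
        return "scanned"   # page images only: OCR needed
    return "mixed"         # some pages will need OCR


print(classify_pdf(["A full paragraph of extractable text.", ""]))
```

A "mixed" result is worth flagging to the client, since only part of the file can be imported directly into a CAT tool.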


 

Valeria Ricciardi  Identity Verified
Italy
Local time: 10:40
Member (2005)
English to Italian
+ ...
TOPIC STARTER
Have imported my PDFs into memoQ and translated them. I keep receiving an error message during expor Jan 18

Hi again. Thank you so much for all the precious info in the previous replies.

I succeeded in importing the PDFs with TransPDF and translating them. Now I should export them to the stored path to retrieve the final PDF (with memoQ 8.2 this feature is available), but something prevents me from doing so. I keep receiving an error message saying "Could not upload the stored file to the TransPDF service. Check your connection and try again."

I was able to see the final PDF of one of the three PDFs to translate just once. Then I discarded it because there were typos, and I haven't been able to get the target PDF again.

I don't have a clue at this point. I restarted my PC, logged out of my TransPDF account and signed in again, as indicated by memoQ support. But nothing changed and I am desperate right now because my client is waiting.

Thank you for your help

Best
Valeria


 

José Henrique Lamensdorf  Identity Verified
Brazil
Local time: 06:40
English to Portuguese
+ ...
Try Infix then Jan 18

Valeria Ricciardi wrote:

I don't have a clue at this point.


By this time you must have a complete TM in memoQ.

Download Infix free demo from http://www.iceni.com . Open the PDF, export the tagged text as tagged XML or TXT, at your choice, and save the tagged PDF under a different file name.

Run your exported text file through memoQ, and save the translation.

Open the tagged PDF file on Infix, and import your translation there. Save that PDF.

Then check the resulting layout, and adjust whatever you want with the tools available on Infix.

The only hitch in doing it with a free demo is that you'll get every page rubber-stamped with a note that you did it with a free demo. If you buy their license, just open the PDF with the registered Infix, save it, and that watermark will go away. They also have a pay-per-save economy plan.


 

