Translate in memoQ starting from a PDF
Thread poster: Valeria Ricciardi

Valeria Ricciardi
Italy
Local time: 17:54
Member (2005)
English to Italian
Jan 15

Hi everyone.

I was wondering whether there is a way to work with PDFs directly in memoQ 8.2, or whether the document must first be converted to Word format before it can be used with this CAT tool.

Thank you for your valuable support.

Valeria


 

Thomas T. Frost
Member (2014)
Danish to English
Yes Jan 15

You can import a PDF directly into memoQ, but as far as I remember the target file will be written in Word format, unless that has changed in version 8, which I don't use.

 

Valeria Ricciardi
Italy
Local time: 17:54
Member (2005)
English to Italian
TOPIC STARTER
Great! Jan 15

Thank you so much, Thomas! I see that memoQ 8.1 and above have an integrated tool that I hadn't noticed before. As a matter of fact, I had never received PDFs until now!

Thanks again and regards,

Valeria


 

Maija Cirule
Latvia
Local time: 18:54
Member (2014)
German to English
It depends on the PDF Jan 16

Valeria Ricciardi wrote:

Hi everyone.

I was wondering whether there is a way to work with PDFs directly in memoQ 8.2, or whether the document must first be converted to Word format before it can be used with this CAT tool.

Thank you for your valuable support.

Valeria

If the file is scanned, the only way to translate it in memoQ 2015 is to convert it to Word first. I am not sure whether memoQ 8 handles this any differently.


 

José Henrique Lamensdorf
Brazil
Local time: 12:54
English to Portuguese
How a PDF is born Jan 16

Perhaps this info has been lost over time...

Many years ago, Adobe invented PostScript, which was a printing language (and a printer standard as well). Any file, no matter how complex, could be "printed" to a *.ps file. If that file was then sent to any PostScript printer (either a domestic laser or an industrial phototypesetter), the resulting printout would be exactly the same, limited only by the physical resolution of that printer.

Then Adobe invented the PDF (Portable Document Format). It was the result of "distilling" (as they named it) such *.ps files into PDF files that would open looking exactly the same on any system (Windows, DOS, MacOS, Linux, etc.), using the Acrobat Reader program built for that system.

So a distilled PDF has all its contents organized within itself in a standard manner. Translating it is a matter of accessing the text therein and replacing it with the corresponding translation.

This is easier said than done, as text will often swell or shrink in translation. Furthermore, it makes no visible difference whether a title is truly centered on the page or left-aligned to a margin that happens to place it in the center... until it changes length during translation, and is no longer centered.
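
To put numbers on the centering problem, here is a minimal sketch in Python. The page width, the title widths and the function names are all made up for illustration; no PDF library is involved, it is only the arithmetic.

```python
# Illustration only: all measurements are hypothetical, in points.
PAGE_WIDTH = 595.0  # roughly the width of an A4 page in points


def left_edge_for_centering(text_width: float, page_width: float = PAGE_WIDTH) -> float:
    """Return the x position at which left-aligned text of this width looks centered."""
    return (page_width - text_width) / 2


def centering_error(x_left: float, text_width: float, page_width: float = PAGE_WIDTH) -> float:
    """Return how far the text's midpoint is from the page's midpoint (positive = too far right)."""
    return (x_left + text_width / 2) - page_width / 2


source_width = 200.0      # width of the original title
translated_width = 260.0  # the translated title came out 30% longer

# The file only records where the title starts, not the intent "centered".
x_left = left_edge_for_centering(source_width)

print(x_left)                                     # 197.5
print(centering_error(x_left, source_width))      # 0.0, the original looks centered
print(centering_error(x_left, translated_width))  # 30.0, the translation sits off-center
```

The longer title starts at the same position, so its midpoint drifts 30 points to the right. That is the kind of shift that has to be fixed by hand after translation.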


A short detour should be made here.

Microsoft Word, as its name implies, is a word processor. Its original paradigm is the typewriter. Evidence of that is that you must always start on the first page (no matter what its number is) and keep adding text, like filling a sausage. If you add a paragraph, say, on page 22, all the following pages will reflow.

DTP, short for DeskTop Publishing, is a different game. Its original paradigm is the paste-up/art studio, and its pioneer software, PageMaker (InDesign's "father"), follows it closely. Other DTP programs have paradigms of their own, with no counterpart in real life. A DTP program is able to handle very complex layouts accurately. You can work, say, on page 22, then on 60, then on 4 and, with the exception of text blocks spreading over multiple pages, nothing will move from where it is.


Back to our case...

Several CAT tools, memoQ among them, can reach into PDF files, so you can translate the text within them. However, the layout of the translation will be cockeyed, as the text length changes in each block.

So one solution was found by developing converters from PDF into DOC/X. This would enable translators to fix the layout after translation using MS Word. Actually, most found it easier to translate the converted DOC/X file directly.

If it's a plain-text book, this will be an easy way out. However, if the layout is something more complex, with charts, tables, callouts, labels, etc., assembled with a DTP app, Word is not an adequate tool to deal with it.

The recent trend is towards PDF editors, programs that will enable translators (and other folks too!) to adjust the layout on PDF files, so that their translation can again fit neatly into the allotted space.

One such PDF editor is Infix, from http://www.iceni.com. Its basic m.o. is to export all text from a PDF into tagged TXT, XML, or XLIFF format for translation, while tagging the PDF too; the text then gets translated outside the PDF using the CAT tool of your choice; and finally the translation is imported back into the PDF, every chunk of text in the right place, with the right font, etc. Well, not quite finally: as a PDF editor, it then offers the DTP tools the translator needs to make all the necessary layout adjustments and get a pristine translated PDF.
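
For those who like to see the idea spelled out, here is a rough Python sketch of that export, translate, import round trip. The XML layout, the chunk ids and the function names are invented for illustration; this is not Infix's actual file format, and a dictionary stands in for both the CAT tool and the tagged PDF.

```python
import xml.etree.ElementTree as ET

# Hypothetical export: each chunk of text carries an id that also stays behind
# in the tagged PDF. This is NOT Infix's real format, just the general idea.
exported = """<story>
  <chunk id="t1">How a PDF is born</chunk>
  <chunk id="t2">Text will often swell in translation.</chunk>
</story>"""

# Stand-in for the CAT tool: a lookup from source text to translation.
translation_memory = {
    "How a PDF is born": "Come nasce un PDF",
    "Text will often swell in translation.": "Il testo spesso si allunga nella traduzione.",
}


def translate_export(xml_text: str, memory: dict) -> str:
    """Replace the text of every chunk, leaving the ids untouched."""
    root = ET.fromstring(xml_text)
    for chunk in root.iter("chunk"):
        # Untranslated chunks keep their source text.
        chunk.text = memory.get(chunk.text, chunk.text)
    return ET.tostring(root, encoding="unicode")


def import_into_layout(layout: dict, xml_text: str) -> dict:
    """Put each translated chunk back into the slot that has the same id."""
    root = ET.fromstring(xml_text)
    updated = dict(layout)
    for chunk in root.iter("chunk"):
        if chunk.get("id") in updated:
            updated[chunk.get("id")] = chunk.text
    return updated


# Stand-in for the tagged PDF: slot id -> text currently sitting in that slot.
tagged_layout = {
    "t1": "How a PDF is born",
    "t2": "Text will often swell in translation.",
}

translated_xml = translate_export(exported, translation_memory)
result = import_into_layout(tagged_layout, translated_xml)
print(result["t1"])
```

The point is that the ids, not the text, tie each translated chunk back to its place on the page, which is why the tagged PDF has to be saved alongside the export.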

There are other PDF editors. You may use memoQ (or another CAT tool) to reach into the PDF, translate the text, and later use one such editor to fix the resulting layout manually.


Of course, this applies only to "distilled" (aka "live" or "editable") PDFs. If you get a scanned, or "dead", PDF, you'll have to do OCR on it. Again, if it's plain, flowing text, MS Word can handle it. If the layout is complex, it will have to be rebuilt with a DTP app.

Again, Infix has some tools (including OCR) to do that. However, I'm so fast with PageMaker after having used it for so many years that I have never used them.


 

Valeria Ricciardi
Italy
Local time: 17:54
Member (2005)
English to Italian
TOPIC STARTER
Have imported my PDFs into memoQ and translated them. I keep receiving an error message during expor Jan 18

Hi again. Thank you so much for all the valuable info in the previous replies.

I succeeded in importing the PDFs with TransPDF and translating them. Now I should export them to the stored path to retrieve the final PDF (this feature is available in memoQ 8.2), but something prevents me from doing so. I keep receiving an error message saying "Could not upload the stored file to the TransPDF service. Check your connection and try again."

I managed to get the final PDF for one of the three files just once. I discarded it because it contained typos, and since then I haven't been able to get the target PDF again.

I don't have a clue at this point. I restarted my PC, logged out of my TransPDF account, and signed in again, as advised by memoQ support. Nothing changed, and I am desperate right now because my client is waiting.

Thank you for your help.

Best
Valeria


 

José Henrique Lamensdorf
Brazil
Local time: 12:54
English to Portuguese
Try Infix then Jan 18

Valeria Ricciardi wrote:

I don't have a clue at this point.


By this time you must have a complete TM in memoQ.

Download the free Infix demo from http://www.iceni.com. Open the PDF, export the text as tagged XML or TXT, whichever you prefer, and save the tagged PDF under a different file name.

Run your exported text file through memoQ, and save the translation.

Open the tagged PDF file in Infix, and import your translation there. Save that PDF.

Then check the resulting layout, and adjust whatever you want with the tools available in Infix.

The only hitch in doing it with the free demo is that you'll get every page rubber-stamped with a note that you did it with a free demo. If you buy a license, just open the PDF with the registered Infix, save it, and that watermark will go away. They also have a pay-per-save economy plan.


 

