How to translate a PDF file?
Thread poster: Riccardo89

Riccardo89
Italy
Local time: 20:48
Oct 23, 2015

Hello,
As the title says, I have to translate a PDF file with SDL Trados 2015.
How do I start, and what do I need?
Do I need OCR software, for example ABBYY FineReader Pro?
Once the file has been converted by OCR, how do I load it into Trados 2015?
Could you walk me through the procedure for the automatic translation?
Could you give me a hand with this?
Thanks in advance; I look forward to your answers.


 

Łukasz Gos-Furmankiewicz  Identity Verified
Poland
Local time: 20:48
English to Polish
+ ...
... Oct 23, 2015

As a translator you are not a client's or agency's office assistant and should not need to deal with low-skill tasks relating to file preparation, file formats etc. Those are tasks for basic-level secretaries and in some cases IT technicians or copy/fax personnel, which don't require the qualifications of a professional translator and hence should not be on your time, executed personally by you — or supervised by you on your own side of things (i.e. within the mini-company which you run either literally as a sole-proprietor or metaphorically as a non-company freelancer).

On the other hand, agencies make money by maximizing their prices and minimizing their costs, and one of the ways of minimizing costs is convincing you that somehow those trivial time-wasting minutiae are your job. Which, once again, they are not.

Still, if you want to open an editable (i.e. text) PDF file in Trados Studio 2015, all you need is to just open the file like any other file. Trados will do the rest. It has a good filter, although the one provided in MS Word 2013 may be better in some situations. Both are probably better than most free OCR software.

Regarding fine-tuning the OCR settings in ABBYY FineReader etc., that's simply not your job. You are neither a secretary nor an IT technician. You need to insist on this or else clients will constantly nibble and chip away at your own professional status and that of all the other translators (indirectly).


 

Alexander Somin  Identity Verified
Germany
Local time: 20:48
Member (2014)
English to Russian
+ ...
Studio 2015 performs OCR Oct 23, 2015

Hi Riccardo,

SDL Trados 2015 performs OCR. You can start working as you would with a "usual" format and watch the prompts for what to do next. Also explore the settings. Studio 2015 may not perform this job flawlessly, so check the output of the OCR feature.


 

Bernhard Sulzer  Identity Verified
United States
Local time: 14:48
English to German
+ ...
The file must be manageable in the CAT tool Oct 23, 2015

Alexander Somin wrote:

Hi Riccardo,

SDL Trados 2015 performs OCR. You can start working as you would with a "usual" format and watch the prompts for what to do next. Also explore the settings. Studio 2015 may not perform this job flawlessly, so check the output of the OCR feature.


OCR or no OCR, the file must be manageable in Trados; if there are a thousand tags, it's useless.
Ask for a file that is already prepared for use in Trados, without thousands of tags. Check what the file looks like in Trados before you start working on it.

Plus see Lukasz's comments.

[Edited at 2015-10-23 20:25 GMT]


 

Riccardo89
Italy
Local time: 20:48
TOPIC STARTER
thank you all :) Oct 24, 2015

OK, I'll work through the integrated guides in SDL Trados 2015. :)

 

Merab Dekano  Identity Verified
Spain
Member (2014)
English to Spanish
+ ...
Advice Oct 24, 2015

Pdf files are not supposed to be translated. These files are for reading.

The reality is that we do receive pdf files and do translate them. There are two ways, basically:

1. If the file contains mainly text and straightforward images, OCR it. I personally use ABBYY FineReader 12. Make sure you monitor what the software is doing. Oftentimes you will need to do some work manually within the OCR environment (don’t just let the program do the work automatically).

2. In some cases (too dense files, different fonts, etc.), just start translating in a new Word file; no CAT tool (short documents).

If you do use OCR software, make sure the file does not get corrupted. It happened to me at least twice. I ran the file through my OCR software, translated it, but Studio wouldn’t release it (strange error, no solution). I had to copy segments manually into a clean Word file. To avoid last-minute stress, “pseudo translate” the file and try to export it even before you start translating.

In either case, it will take more time to translate than when the client provides a Word file that can be input into your CAT tool right away. So, make sure you factor that extra time into your rate. Or, even better, charge your normal rate plus an hourly fee for the extra preparatory work.

It might sound stupid, but more often than not, if you ask your client to send you a Word file instead of a pdf file, you get a reply: “sure, here you go”.
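To illustrate what a pseudo-translation round trip is checking, here is a minimal Python sketch. It is not how Studio implements the feature; the function name `pseudo_translate`, the vowel mapping and the 20% padding are assumptions made up for this example. The idea is simply to fill every segment with obviously fake, slightly longer text, so you can test the export before investing real translation time.

```python
# Map plain vowels to accented ones so pseudo-translated text is easy to spot.
ACCENTED = str.maketrans("aeiouAEIOU", "áéíóúÁÉÍÓÚ")


def pseudo_translate(segment: str, expansion: float = 0.2) -> str:
    """Accent the vowels and pad the segment to simulate longer target text."""
    # Padding length is a fraction of the source length (hypothetical default: 20%).
    padding = "~" * round(len(segment) * expansion)
    # Brackets make truncated or missing segments visible in the exported file.
    return f"[{segment.translate(ACCENTED)}{padding}]"


print(pseudo_translate("Open the file"))
```

If the export of a file filled this way fails, the problem lies in the converted file rather than in anything you typed.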


 

Riccardo89
Italy
Local time: 20:48
TOPIC STARTER
I got it :) Oct 25, 2015

Merab Dekano wrote:

Pdf files are not supposed to be translated. These files are for reading.

The reality is that we do receive pdf files and do translate them. There are two ways, basically:

1. If the file contains mainly text and straightforward images, OCR it. I personally use ABBYY FineReader 12. Make sure you monitor what the software is doing. Oftentimes you will need to do some work manually within the OCR environment (don’t just let the program do the work automatically).

2. In some cases (too dense files, different fonts, etc.), just start translating in a new Word file; no CAT tool (short documents).

If you do use OCR software, make sure the file does not get corrupted. It happened to me at least twice. I ran the file through my OCR software, translated it, but Studio wouldn’t release it (strange error, no solution). I had to copy segments manually into a clean Word file. To avoid last-minute stress, “pseudo translate” the file and try to export it even before you start translating.

In either case, it will take more time to translate than when the client provides a Word file that can be input into your CAT tool right away. So, make sure you factor that extra time into your rate. Or, even better, charge your normal rate plus an hourly fee for the extra preparatory work.

It might sound stupid, but more often than not, if you ask your client to send you a Word file instead of a pdf file, you get a reply: “sure, here you go”.


So I have to run OCR first, and then, once the PDF file has been converted to Word, move on to SDL Trados 2015, right?
And after converting the file with OCR, what do I do next? Could you point me to a manual, please?
Thank you all for the answers. :)


 

Christine Andersen  Identity Verified
Denmark
Local time: 20:48
Member (2003)
Danish to English
+ ...
Ask the client is really the answer. Oct 26, 2015

Although Studio does have an OCR function, the output is a Word file. In principle you can then convert that back to PDF, or the client can.

However, the round trip may have played havoc with formatting and layout, if the document is to go through further processing.

If it is only to be read, e.g. a birth certificate, pure text or similar, that may not be a problem. As long as the document looks tidy and the layout is recognisable, so you can see how each section of the translation corresponds to the original, then it is probably OK.

If there are graphics, then Studio may or may not handle them well, and it may be a nightmare for a DTP department to sort them out.

If your client has sent you the PDF and asked you to use Studio, it is possible that they know what they are doing, but they may not!


 

José Henrique Lamensdorf  Identity Verified
Brazil
Local time: 15:48
English to Portuguese
+ ...
Beg to disagree Oct 26, 2015

Merab Dekano wrote:

Pdf files are not supposed to be translated. These files are for reading.


If you mean SCANNED (aka "dead") PDF files, I agree with you. They are equivalent to hard copy, with the advantage of demanding cheaper postage, if any, to be sent anywhere. Their translation includes redoing all DTP work as well.

OTOH PDFs are finally laid-out pieces of work, intended to come up onscreen or on paper always the same way, regardless of whether they are opened on the latest iPhone or a vintage PC running DOS, including a Mac, PCs running Windows or Linux, whatever.

Translating in Word is easier, because it is a word processor, as its name supposedly implies. Text there will reflow (and sometimes create havoc among non-text elements) due to shrinking/swelling in translation.

A PDF is supposedly a laid-out finished publication. It may be generated by ANY software capable of printing to a PostScript printer (it's a standard, not a brand), including all the DTP apps and, of course, word processors as well, among countless others.

High-level DTP apps, like InDesign (and its father, PageMaker), QuarkXpress, and FrameMaker are expensive, have a somewhat steep learning curve and, above all, create files that are mutually incompatible. There were some converters attempting to bridge one DTP app to another, but I never saw one that offered minimally acceptable output. Low-level ("amateur") DTP apps are, for instance, Serif PagePlus, Microsoft Publisher, and the opensource Scribus.

If Word were a DTP app, Microsoft would have discontinued its lame Publisher many years ago. AFAIK a Publisher file can be EXported to Word, perhaps for translation, however it canNOT be imported back! Such translation will have to be DTP'ed from scratch.


Trados, MemoQ, and a few other CAT tools can get into a PDF for translation, and supposedly put the text back in the right places, with the right fonts, colors, sizes, etc. However I wonder how a CAT tool copes with text swelling or shrinkage.

In my language pair, text in EN will "swell" randomly from 0% to 20% in char count. This will cause overflow and misalignment in ANY DTP app, which must be corrected. I doubt that any CAT tool has what it takes to fix the layout in the process.
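To get a feel for how much a given translation swells, one can compare character counts segment by segment. The following is a minimal Python sketch; the function name `expansion_percent` and the sample segments are illustrative assumptions, not part of any CAT tool.

```python
def expansion_percent(source: str, target: str) -> float:
    """Return the change in character count from source to target, in percent."""
    if not source:
        raise ValueError("source segment must not be empty")
    return (len(target) - len(source)) / len(source) * 100.0


# Hypothetical EN > PT segment pairs, for illustration only.
segments = [
    ("Save", "Salvar"),
    ("Print the file", "Imprimir o arquivo"),
]

for src, tgt in segments:
    print(f"{src!r} -> {tgt!r}: {expansion_percent(src, tgt):+.1f}%")
```

Short segments such as button labels can swing far more than the average for a whole document, which is one reason layout tends to break in tables and captions first.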

The solution I've been using so far is Infix Pro. I've put together a walk-thru of the process on this page.

I've been told that it is possible to do it with a program called NitroPDF. I recall having tried it briefly and given up.

Quite recently, I was advised of a new option, FormSwift. Haven't had the time to check it out yet.


In any case, if you decide to go forward and translate a PDF, don't forget to charge for DTP work!

As I didn't - and still don't - have a market reference, I have been charging on average USD 10.00 per A4 page for retrofitting the PDF layout after translation via Infix (on top of per-word translation rates, of course). It makes no difference whether the page has only one big "Introduction" or a very complex flowchart or crammed table. So far I don't regret it, and most educated clients consider it more cost-effective than going back to the DTP pasteboard.
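As a worked example of that pricing model, here is a small Python sketch. The USD 10.00 per page comes from the figure above; the function name `quote_usd`, the per-word rate and the job size are hypothetical values chosen only to show the arithmetic.

```python
def quote_usd(word_count: int, rate_per_word: float, pages: int,
              dtp_fee_per_page: float = 10.00) -> float:
    """Per-word translation fee plus a flat per-page DTP fee, in USD."""
    translation_fee = word_count * rate_per_word
    dtp_fee = pages * dtp_fee_per_page
    # Round to cents to avoid floating-point noise in the total.
    return round(translation_fee + dtp_fee, 2)


# Hypothetical job: 2,500 words at USD 0.10 per word, laid out on 8 A4 pages.
print(quote_usd(2500, 0.10, 8))
```

Because the DTP fee is flat per page, it does not depend on how dense the page is; only the per-word part tracks the amount of text.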


 

Roland Salois
United States
There are plenty of options May 22

As far as I understand, you need a piece of software to extract plain text from the PDF file so that you can translate it into the language you need. You can do that with just about any PDF editor you can find: Acrobat (though it's the most expensive), Foxit, PDFfiller, you name it. The last one also provides a tool to convert these files to other formats, and if you would rather work with a Word file, here you go: https://www.altoconvertpdftoword.com/ I hope this still comes in useful to you in the near future.

 

José Henrique Lamensdorf  Identity Verified
Brazil
Local time: 15:48
English to Portuguese
+ ...
Three different strategies May 23

Roland Salois wrote:

As far as I understand, you need a piece of software to extract plain text from the PDF file so that you can translate it into the language you need. You can do that with just about any PDF editor you can find: Acrobat (though it's the most expensive), Foxit, PDFfiller, you name it. The last one also provides a tool to convert these files to other formats, and if you would rather work with a Word file, here you go: https://www.altoconvertpdftoword.com/ I hope this still comes in useful to you in the near future.


1. With Infix - the only PDF editor I'm familiar with - you export tagged TXT, XML, or XLIFF and, at the same time, tag your PDF. Then you translate the exported tagged file as plain text, without messing up the tags. After this is done, you import the translated file back into the tagged PDF. The tags will take care of putting text where it belongs, with the right font, size, style, etc. While Infix will try to fit the text, swelling/shrinking in translation, as well as alignment parameters, will require DTP adjustments. As a PDF editor, Infix has the tools to do all of it. (Maybe there are other PDF editors using this strategy, I wouldn't know.)

2. Some CAT tools, probably Trados and MemoQ among them, can trespass into PDF files to translate text. However as the text swells or shrinks in translation, it will get crooked. Then you can use any good PDF editor to fix DTP.

3. You suggest converting the PDF into a Word file, DOC, DOCX, RTF, whatever. IMHO this is by far the worst option of all. Though I have a couple of good file converters, quite often the layout gets crooked in this process. Even if it doesn't, you'll have to fix it after translation and, again IMHO, MS Word is definitely not a good option to do DTP. Using Word to do DTP makes as much sense as using PowerPoint to subtitle videos! If I were wrong, Microsoft would have slaughtered and buried their horrible MS Publisher long ago.

The whole point is in the PDF file structure. PDFs are intended to display/print - with the adequate Acrobat Reader - exactly the same publication, regardless of the OS being used, nothing else. So a text block is not necessarily a text block; it may be a bunch of independent lines neatly arranged to look like one.


 

Maxi Schwarz
Local time: 13:48
German to English
+ ...
You mean, using CAT tools May 23

Riccardo89 wrote:

....
I have to translate a PDF file with SDL Trados 2015.
How do I start, and what do I need?

I translate plenty of PDF files. 90% of my work probably comes in that format. But I don't use CAT tools, and the files are the type that you wouldn't want or need a CAT tool for anyway. I do convert most of them using a service by Adobe that I get by monthly subscription, just to take advantage of things like numbers and proper names being readily there for me, but keep in mind that the conversion can introduce some distortions (8 becoming B or 6 etc.).

I just wanted to point out that the type of text and translation you are doing is pertinent to the question. :)
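One way to catch such distortions before they reach the translation is to flag tokens that mix digits and letters and contain a character OCR is known to confuse. This is only a rough Python sketch: the look-alike list and the function name `suspicious_tokens` are my own assumptions, and the heuristic will produce false positives (for example "2nd") that still need a human check against the original.

```python
import re

# Characters that OCR output commonly mixes up (illustrative, not exhaustive).
LOOKALIKES = set("0OoQ1lI5S6b8B2Z")


def suspicious_tokens(text: str) -> list[str]:
    """Return tokens that mix digits and letters and contain a look-alike character."""
    flagged = []
    for token in re.findall(r"\w+", text):
        has_digit = any(ch.isdigit() for ch in token)
        has_alpha = any(ch.isalpha() for ch in token)
        if has_digit and has_alpha and any(ch in LOOKALIKES for ch in token):
            flagged.append(token)
    return flagged


# Hypothetical OCR output with two corrupted numbers.
sample = "Invoice 2B15 dated 23 October, total 1O50 EUR, ref 4471"
print(suspicious_tokens(sample))
```

A number that was misread as another valid number (8 read as 6) cannot be caught this way, so figures still need to be compared with the source PDF.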


 

