https://www.proz.com/forum/sdl_trados_support/113926-how_to_prepare_pdf_files_for_tageditor.html

how to prepare .pdf files for TagEditor?
Thread poster: anamaria bulgariu
anamaria bulgariu  Identity Verified
Romania
Local time: 09:43
Member (2007)
English to Romanian
+ ...
Aug 29, 2008

I've been meaning to ask this for a long time now and I apologize to those who will say that I need to do some more research because this topic has been discussed in this forum time and time again.

In a nutshell:
I have a client who always sends beautifully arranged .ttx files for me to work on using TagEditor. The interesting thing is that he also sends .pdfs for reference. I also receive PDFs on a regular basis from other clients, so my question is: is there any way to convert .pdf to .ttx while keeping the formatting, returns, etc., other than the usual way (i.e. converting the .pdf into .rtf or plain .doc first)?
Frankly, I should charge more for all the work I need to put in to keep to the format, but first I need to make sure that I'm not missing anything that would make my work easier.

Thank you in advance,
Ana


 
Peter Linton (X)  Identity Verified
Local time: 07:43
Swedish to English
+ ...
No PDF to TTX Aug 29, 2008

AFAIK, there is no direct route from PDF to TTX -- and for a very simple reason. A PDF file does not contain sentences or paragraphs. It merely contains a number of 'objects' containing text and/or graphics at particular locations on the page. So there is no way you can extract complete sentences -- and that is a fundamental requirement for TagEditor.
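To illustrate the point, here is a minimal Python sketch. It assumes the text objects have already been pulled out of the PDF by some extraction tool; the `Fragment` type, the coordinates and the `y_tolerance` value are all made up for the example and are not the output of any particular program. All a converter can do is guess which fragments belong on the same visual line from their positions:

```python
from typing import NamedTuple


class Fragment(NamedTuple):
    """One positioned text object, as a PDF page stores it (hypothetical layout)."""
    x: float   # horizontal position on the page
    y: float   # vertical position (here: larger means further down)
    text: str


def fragments_to_lines(fragments, y_tolerance=2.0):
    """Guess visual lines: group fragments with a similar y, then order them by x."""
    lines = []
    for frag in sorted(fragments, key=lambda f: (f.y, f.x)):
        if lines and abs(lines[-1][0].y - frag.y) <= y_tolerance:
            lines[-1].append(frag)   # same visual line as the previous fragment
        else:
            lines.append([frag])     # start a new visual line
    return [" ".join(f.text for f in sorted(line, key=lambda f: f.x))
            for line in lines]


# Fragments arrive in no particular order; "Table 1" belongs to another column.
page = [
    Fragment(72, 114, "switched off before"),
    Fragment(300, 100, "Table 1"),
    Fragment(72, 100, "The pump must be"),
    Fragment(72, 128, "servicing."),
]
print(fragments_to_lines(page))
```

With this sample page the first line comes out as "The pump must be Table 1": the position-based guess glues a table caption into the middle of a sentence, and nothing in the data says where the sentence or paragraph really ends.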

 
juholding  Identity Verified
Norway
Local time: 08:43
French to Norwegian
+ ...
there is an imperfect way... Aug 29, 2008

I have also received PDF files for translation.
What I have done is select all the text in the PDF file and then choose "Save as text" from the File menu. That way I can translate it in TagEditor.
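Text saved this way usually has a hard return at the end of every visual line, which tends to split sentences into separate segments in a CAT tool. A rough Python sketch of how the lines could be rejoined first is below. It assumes blank lines mark paragraph breaks and that a trailing hyphen followed by a lowercase letter is a word split across lines; both are guesses that will be wrong for some files, and the function name `unbreak` is only illustrative.

```python
import re


def unbreak(text: str) -> str:
    """Rejoin hard-wrapped lines; blank lines are kept as paragraph breaks."""
    paragraphs = re.split(r"\n\s*\n", text.strip())
    result = []
    for para in paragraphs:
        joined = ""
        for line in (ln.strip() for ln in para.splitlines()):
            if not line:
                continue
            if joined.endswith("-") and line[:1].islower():
                # Assumed to be a word hyphenated at the line end: drop the hyphen.
                joined = joined[:-1] + line
            elif joined:
                joined += " " + line
            else:
                joined = line
        result.append(joined)
    return "\n\n".join(result)


raw = (
    "The pump must be switched off be-\n"
    "fore servicing.\n"
    "\n"
    "Use only original\n"
    "spare parts.\n"
)
print(unbreak(raw))
```

Note that a genuine hyphen at the end of a line (as in "well-" / "known") would be merged incorrectly, so the result still needs a check against the PDF.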


 
Fernando Toledo  Identity Verified
Spain
Local time: 08:43
German to Spanish
But, but Aug 29, 2008

You get the TTX files for translation and PDFs for reference.
Well, you should be happy! You get not only part of the job already done (the TTX files) but also references in PDF format, which are very useful for paging through, copying and pasting, printing in a well-made layout, and searching, and which you can keep open on a second monitor.

Why do you want to transform references into TTX?


Regards

Fernando

www.lenguatik.com


 
Eric Le Carre  Identity Verified
France
Local time: 08:43
English to French
+ ...
These free PDF converters might help you Aug 29, 2008

Hi Anamaria,

Try your luck with the free PDF converters at http://www.somepdf.com/downloads.html. 'Some PDF to TXT Converter 1.4' in particular might be useful.

For my part, I'm using 'Some PDF to Word Converter 1.4', and the results are outstanding for a free tool.

Best regards,

Eric


 
Hynek Palatin  Identity Verified
Czech Republic
Local time: 08:43
Member (2003)
English to Czech
+ ...
Source files Aug 29, 2008

I have a client who always sends beautifully arranged .ttx files for me to work on using TagEditor. The interesting thing is that he also sends .pdfs for reference.


Your client has the source files in a DTP format. They can import them to Trados and they can also export them to PDF. They do not convert PDF to TTX.


 
Krzysztof Kłonica  Identity Verified
Poland
Local time: 08:43
English to Polish
+ ...
Source files Aug 29, 2008

I'm not sure about this theory of mine, but I think agencies get the source files from their clients, that is, the files from which the PDF files are created. These can be in all sorts of file formats used by DTP software. From those files you can probably generate tagged TTX files quite easily. This theory might hold for big agencies, which are large enough to have their own copies of various DTP packages. They provide translators with PDF files for reference only and do not use them to generate TTX files. But as I mentioned before, it is just a guess.
I know some translators who have their own DTP software, but it is rather expensive.
If you don't have many clients who send you source files, it is not really worth purchasing such software, and you have to stick to PDF-to-TXT/DOC converters or OCR software, which is exactly what I do.
Just recently, I've been using a nice little program called AutoUnbreak. It generates text files from PDF files while keeping all the font formatting.


 
anamaria bulgariu  Identity Verified
Romania
Local time: 09:43
Member (2007)
English to Romanian
+ ...
TOPIC STARTER
Guess? Aug 29, 2008

stake wrote:

I'm not sure about this theory of mine, but I think agencies get the source files from their clients, that is, the files from which the PDF files are created. These can be in all sorts of file formats used by DTP software. From those files you can probably generate tagged TTX files quite easily. This theory might hold for big agencies, which are large enough to have their own copies of various DTP packages. They provide translators with PDF files for reference only and do not use them to generate TTX files. But as I mentioned before, it is just a guess.
I know some translators who have their own DTP software, but it is rather expensive.
If you don't have many clients who send you source files, it is not really worth purchasing such software, and you have to stick to PDF-to-TXT/DOC converters or OCR software, which is exactly what I do.
Just recently, I've been using a nice little program called AutoUnbreak. It generates text files from PDF files while keeping all the font formatting.


Well, first, thank you all for your answers. I figured that there was no easy way to get this done because I did do some research myself.

As for your guess, Stake, it might be quite true. I was reluctant to ask the agency myself, but I probably will, just to satisfy my "burning" curiosity.
Most likely, they have good enough DTP software, because they said they have a DTP team up their sleeve. From my point of view, this is one of the things that justify agencies' higher rates.

As for the "Save As" solution, hmm... disastrous, especially with tables and... disastrous says it all. It would take a whole lot of time just to remove the stray characters in the converted text.

Thank you again. It was worth a try.
Ana


 
anamaria bulgariu  Identity Verified
Romania
Local time: 09:43
Member (2007)
English to Romanian
+ ...
TOPIC STARTER
turning reference into ttx is not the problem Aug 29, 2008

Fernando Toledo wrote:

Why do you want to transform references into TTX?



Dear Fernando,

I have various clients. Some send .ttx (for which I'm so thankful that I can offer discounts) and some just .pdfs. My problem comes up when it's up to me to do the formatting and everything else to make my translation look like the original pdf.

Up to this point... I just said: if I need to do it, I need to do it and that is it, but it takes too much of my time and productivity is quite low this way.

My sincere thanks go out to Eric Le Carre. I have never seen anything like it, and I have used quite a few of these "imperfect solutions" for .pdf conversion. Quick and quite accurate too!!! Not to mention FREE.


 
Pablo Bouvier  Identity Verified
Local time: 08:43
German to Spanish
+ ...
Get a translated PDF in 5 steps... Aug 29, 2008

I don't know of any procedure for translating PDFs easily without some preparatory work.
As I usually receive a lot of PDFs to translate (Germans are not very clever in this respect), I use the following procedure:

1) PDF -> RTF (Real page format) with SolidConverterPDF
This converts the PDF to RTF and preserves 80% of the formatting, because all the text is exported into MS Word text boxes.

2) RTF text boxes -> RTF (text) with the Werecat macro
This macro extracts the text from the MS Word text boxes as tagged text.

3) Translate it with your CAT tool as usual.

4) Use the Werecat macro to put your translated text back into the text boxes.

5) Save your RTF files as PDF with PrimoPDF or something similar.
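The idea behind steps 2 and 4 (take the text out of the boxes with markers, translate, put it back by marker) can be sketched in Python as follows. This is only an illustration of the round trip: the `<box id="...">` tag format and the function names are invented for the example and are not the format or interface the Werecat macro actually uses.

```python
import re

# Hypothetical tag format, chosen only for this illustration.
BOX_PATTERN = re.compile(r'<box id="(\d+)">(.*?)</box>', re.DOTALL)


def extract_boxes(boxes):
    """Step 2 (sketch): flatten the text boxes into one tagged text."""
    return "\n".join(f'<box id="{i}">{text}</box>' for i, text in enumerate(boxes))


def reinsert_boxes(tagged_text, box_count):
    """Step 4 (sketch): read the translated text back, one entry per box."""
    found = {int(box_id): text for box_id, text in BOX_PATTERN.findall(tagged_text)}
    missing = [i for i in range(box_count) if i not in found]
    if missing:
        # A tag was deleted or damaged during translation.
        raise ValueError(f"tags lost for boxes: {missing}")
    return [found[i] for i in range(box_count)]


source_boxes = ["Sicherheitshinweise", "Pumpe vor der Wartung ausschalten."]
tagged = extract_boxes(source_boxes)

# Step 3 happens in the CAT tool; here it is simulated with plain replacements.
translated = (tagged
              .replace("Sicherheitshinweise", "Instrucciones de seguridad")
              .replace("Pumpe vor der Wartung ausschalten.",
                       "Apague la bomba antes del mantenimiento."))

print(reinsert_boxes(translated, len(source_boxes)))
```

The check for missing tags is the important part: if a tag is lost while translating, the text can no longer be matched to its box, which is why the tags should be left untouched in the CAT tool.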

Best regards.
Pablo B.



[Edited at 2008-08-29 20:50]


 

