Choosing a CAT tool for translating PDF's and Word documents
Thread poster: Translator0101
Translator0101
Croatia
Sep 11, 2014

Hello,

I'm not a pro translator; I just have two cookbooks that I want to translate from English to Croatian.

Now, I can also buy a piece of software, because my cousin needs translation software anyway (he is a pro translator).

Anyway, I'm looking for software that will handle the task at hand (translating these two cookbooks) in the most efficient way.

I've come across MemoQ and SDL Trados, but I don't know which one to choose.

Best regards!



Natalie  Identity Verified
Poland
Local time: 00:23
Member (2002)
English to Russian
+ ...

Moderator of this forum
Please check the existing threads in this forum Sep 11, 2014

Choosing a CAT tool for the first time http://www.proz.com/topic/267094
CAT tools - which tool? http://www.proz.com/topic/266744
Which CAT tool is the best? http://www.proz.com/topic/233380
CAT System choice, can you help? http://www.proz.com/topic/270800
looking for a CAT tool http://www.proz.com/topic/270427
Why should I purchase CAT software? http://www.proz.com/topic/260353
CAT within Word http://www.proz.com/topic/272846
Affordable CAT tool to improve productivity http://www.proz.com/topic/270967



Samuel Murray  Identity Verified
Netherlands
Local time: 00:23
Member (2006)
English to Afrikaans
+ ...
OmegaT (it's free) Sep 11, 2014

Translator0101 wrote:
I'm not a professional translator. I just have two cookbooks that I want to translate from English to Croatian. ... I'm looking for software that will do the task at hand (translating these two cookbooks) the most efficient way.


OmegaT is free, and will be suited for your translation project.

MemoQ and Trados 2014 are both equally suited for your project, but they are expensive and can be complex to use. And if you choose OmegaT now and later decide to get some of the other tools, the skills that you've learnt in OmegaT will still be useful.

https://sourceforge.net/projects/omegat/files/OmegaT%20-%20Latest/

Added: Oh, I just noticed that your subject heading says "Choosing a CAT tool for translating PDF's and Word documents". OmegaT can't translate PDF files. I'm not sure to what degree other CAT tools can actually translate PDF files. As far as I know, both MemoQ and Trados simply convert the PDF file to MS Word, translate that, and then optionally attempt to convert it back.



[Edited at 2014-09-11 11:50 GMT]


Danik 2014
Brazil
German to Portuguese
+ ...
MateCat Sep 11, 2014

A beta trial version of this new tool is available online. It is easy to use and ideal for beginners. It works well with Word, Excel and OneNote files; I'm not so sure about PDF.

Translator0101
Croatia
TOPIC STARTER
Just checking Sep 12, 2014

Samuel Murray wrote:


OmegaT is free, and will be suited for your translation project.


OK, so you're sure that OmegaT is the best software for me?



Didier Briel  Identity Verified
France
Local time: 00:23
Member (2007)
English to French
+ ...
OmegaT can read PDF files Sep 12, 2014

Samuel Murray wrote:
Added: Oh, I just noticed that your subject heading says "Choosing a CAT tool for translating PDF's and Word documents". OmegaT can't translate PDF files.

OmegaT can import the text of textual PDF files, and create a text file with the translation.

OmegaT can also translate Iceni Infix PDF extraction files.

Didier



Samuel Murray  Identity Verified
Netherlands
Local time: 00:23
Member (2006)
English to Afrikaans
+ ...
@0101 Sep 12, 2014

Translator0101 wrote:
Samuel Murray wrote:
OmegaT is free, and will be suited for your translation project.

OK, so you're sure that OmegaT is the best software for me?


I can't say whether it is the "best" for you, but I can recommend it for your project.


MamaG  Identity Verified
United States
Local time: 18:23
French to English
+ ...
Every CAT tool isn't for everybody Sep 17, 2014

I think it would be super-handy to have a poll or questionnaire that lets you use a process of elimination to figure out which CAT tool is right for you, based on what you tend to "do" as a freelancer, what your profile and lifestyle are, etc.

I just bought SDL Trados Freelancer Studio 2014, and I can see now that it was a big mistake, even after a couple of training seminars. The last time I had worked with Trados was with the 2003 Workbench, I think it was called. For what you got paid (you don't get paid for all the fiddling with tags and errors and whatnot - you get paid for matches), the hassle was just not worth it. I chose SDL because it seems to be the industry standard, but the type of work I do does not really fit into an "industrial" context. My work is not the assembly-line sort.

Trados is a very big, resource-intensive, expensive program that seems intended for the kinds of ongoing projects and repeat jobs that large corporations generate, and for text that is already provided to you in a processable format. I think the people who do well with it are people who work with it in companies, or who have a lot of time to learn it (so they can take time off from earning income to actually sit down with it, study it, risk making mistakes, risk losing income, etc.), a natural gift for understanding IT/technical matters in the first place, a few regular clients, same-ish material from those clients, and a preferred limited focus as far as area of specialty goes.

Given the type of work I myself tend to get (pretty varied subject areas, more than one language pair, small-ish docs in original or "only" versions that no one else has worked on yet, and text that cannot always be extracted unmessily from the source document), I should have opted for something much simpler. It has more features than I will ever begin to understand or use, even after two training courses, and it was a huge investment and sacrifice for my family to make.

I suspect I should have started with something much simpler and cheaper. I have asked whether they give refunds, but from what I have read on here, they don't, so I suspect I will just have to swallow it and try to recover. For the "little guy" freelancer with small-to-medium-sized varied clients and material, I would just not recommend that monster program. You do not need that many features, and it is designed for people who work on assembly-line-type projects for large corporations that give regular, same-ish work.

As for which other tool would be good for me or for the other folks Trados hasn't worked out for, I don't know. I did a lot of research before buying it, and ultimately chose it because of its "established" nature, but I hope it doesn't end up gathering dust if I can't get it working for my particular workflow/lifestyle profile. I also learned on this forum that you apparently can't sell your license on eBay, either.


Mulyadi Subali  Identity Verified
Indonesia
Local time: 05:23
English to Indonesian
+ ...
+1 Sep 18, 2014


Well said. No need to have a huge Goliath when tiny David can deliver the same result. Every CAT tool basically does the same thing, i.e., gets the translation done. Just pick one that can handle most, if not all, of the file formats you're assigned.



Mulyadi Subali  Identity Verified
Indonesia
Local time: 05:23
English to Indonesian
+ ...
Any CAT Sep 18, 2014

Working with PDF is always tricky. Most CAT tools can only work with a true PDF, i.e. one from which you can copy and paste the text. If you don't have a true PDF, then you have to use OCR.
Unfortunately, even when working with a true PDF, there will still be some, even many, quirks to handle, such as redundant tags, line breaks, etc.
My favorite way of working with PDF is to convert it to text, losing all the formatting etc. Start the translation with any CAT tool that can handle TXT, which I believe many, if not all, can. Create the translated/target file, convert it to, usually, DOC/DOCX, then recreate the formatting.
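For anyone who wants to script the clean-up step, here is a minimal Python sketch of dealing with the stray line breaks that usually appear when a PDF is saved as plain text. It assumes that paragraphs in the extracted text are separated by blank lines and that every other line break is just a wrap from the page layout; the function name unwrap_paragraphs is made up for this illustration and is not part of any CAT tool.

```python
import re


def unwrap_paragraphs(raw: str) -> str:
    """Join hard-wrapped lines into paragraphs.

    Assumption: a blank line marks a real paragraph break, and any other
    line break is only an artefact of the PDF page layout.
    """
    # Normalise Windows and old Mac line endings to "\n".
    text = raw.replace("\r\n", "\n").replace("\r", "\n")
    # Split on one or more blank lines to get the paragraphs.
    paragraphs = re.split(r"\n\s*\n", text)
    cleaned = []
    for paragraph in paragraphs:
        # Drop empty lines and surrounding spaces, then join with one space.
        lines = [line.strip() for line in paragraph.splitlines() if line.strip()]
        if lines:
            cleaned.append(" ".join(lines))
    # Keep one blank line between paragraphs in the output.
    return "\n\n".join(cleaned)


if __name__ == "__main__":
    sample = "Mix the flour\nand the sugar.\n\nBake for\n20 minutes.\n"
    print(unwrap_paragraphs(sample))
```

The resulting text file has one paragraph per block, which tends to segment more sensibly in a CAT tool. Words hyphenated across line breaks are deliberately left alone here, because rejoining them automatically can damage genuine hyphenated compounds.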



Oliver Walter  Identity Verified
United Kingdom
Local time: 23:23
Member (2005)
German to English
+ ...
Segmentation rules Sep 18, 2014

OmegaT may be good for some types of translation project but it has what for me is a fatal design fault: it insists on doing the segmentation (i.e. defining what is a sentence to translate) by using "segmentation rules". You can modify these rules to some extent, but they cannot be "intelligent" and they cannot take text formatting into account.
Three examples:
  • It cannot be told "do not translate text where the font is formatted red and strike-through" (some of my translation work contains text like this, and it must remain unchanged in the translation process).
  • Does a semicolon end a segment? In some cases I want
    First sentence; second sentence.
    to be considered to be 2 segments, or just one longer segment in other cases, and I can't specify a rule that will automatically recognize these cases.
  • The normal segmentation rules say that a dot (=period) followed by a space ends a segment (except for abbreviations like "Dr." which can be defined in the rules). If the source text accidentally omits the space between 2 sentences (this happens sometimes) I want to be able, at the time I am doing the translation, to specify that the segment ends at that dot, not the next one that is followed by a space.
All of the above requirements are met by WordFast (I use the Classic version which is a giant macro in Word, so a PDF would first need to be converted into a Word, or plain text, document) and possibly some other CAT tools.
As I progress through the text to translate it, I can say, in effect: "Right, stop translating here, and resume at this point later in the text." (That deals with the red strikethrough text).
I can, when the translation process reaches the segments concerned, say "This segment ends at the semicolon-and-space; that other one at a dot-and-space, not the semicolon-and-space that comes a few words earlier."
I can also say "End this segment at this dot (="period" in American) even though it is immediately followed by a letter - that's a typing error in the source text."
In brief: automatic segmentation is very nice, but I do not want to use a CAT tool where the segmentation process is entirely automated.
Oliver

[Edited at 2014-09-18 15:35 GMT]
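To make the point about segmentation rules concrete, here is a rough Python sketch of a purely rule-based segmenter, followed by the kind of manual split Oliver describes. It is only an illustration under simple assumptions (break after ".", "!" or "?" followed by whitespace, except after a short list of abbreviations); it is not how OmegaT, WordFast or any other CAT tool actually implements segmentation, and the names ABBREVIATIONS, segment and split_segment are invented for this example.

```python
import re
from typing import List

# Hypothetical exception list: no segment break after these tokens.
ABBREVIATIONS = {"Dr.", "Mr.", "e.g.", "i.e."}


def segment(text: str) -> List[str]:
    """Split text after '.', '!' or '?' followed by whitespace,
    unless the word before the break is a listed abbreviation."""
    segments = []
    start = 0
    for match in re.finditer(r"[.!?]\s+", text):
        candidate = text[start:match.end()].strip()
        last_word = candidate.split()[-1]
        if last_word in ABBREVIATIONS:
            continue  # exception rule: keep going, no break here
        segments.append(candidate)
        start = match.end()
    tail = text[start:].strip()
    if tail:
        segments.append(tail)
    return segments


def split_segment(segments: List[str], index: int, offset: int) -> List[str]:
    """Manually split one segment at a character offset chosen by the translator."""
    seg = segments[index]
    left, right = seg[:offset].strip(), seg[offset:].strip()
    return segments[:index] + [left, right] + segments[index + 1:]


if __name__ == "__main__":
    # The source text is missing a space after "salt." (a typing error).
    source = "Dr. Smith added salt.Then he stirred. Serve warm."
    auto = segment(source)
    print(auto)
    # The rule cannot see the missing space, so the translator splits by hand.
    fixed = split_segment(auto, 0, auto[0].index("Then"))
    print(fixed)
```

The rule handles "Dr." correctly but treats "salt.Then he stirred." as one unit, because a dot followed by a letter matches no rule. Only a manual split at translation time separates the two sentences, which is the case described in the third example above.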



Michael Joseph Wdowiak Beijer  Identity Verified
United Kingdom
Local time: 23:23
Member (2009)
Dutch to English
+ ...
CafeTran Oct 21, 2014

Hi Oliver,

I'm pretty sure CafeTran can do all those things too. CT uses rules, but you can join and split segments anywhere you want while working, and of course add rules as you translate (although these won't be applied until the next time you import/segment a text.)

Regarding your three points:
• It cannot be told "do not translate text where the font is formatted red and strike-through" (some of my translation work contains text like this, and it must remain unchanged in the translation process).

CT does not import hidden text, so before I import the file I use TransTools to hide anything I don't want to translate. For example, it can hide all yellow highlighted text, all text that is not highlighted yellow, etc.

• Does a semicolon end a segment? In some cases I want
First sentence; second sentence.
to be considered to be 2 segments, or just one longer segment in other cases, and I can't specify a rule that will automatically recognize these cases.

In this case I just join and split the relevant segments on the fly.

• The normal segmentation rules say that a dot (=period) followed by a space ends a segment (except for abbreviations like "Dr." which can be defined in the rules). If the source text accidentally omits the space between 2 sentences (this happens sometimes) I want to be able, at the time I am doing the translation, to specify that the segment ends at that dot, not the next one that is followed by a space.

Same as above: I just join and split the relevant segments on the fly.
-----------------------------------------------------------------------------------------*

I think it would be cool if we had some kind of trick that would allow us to re-import (and hence re-segment) only specific ranges of text (or segments) in an already imported document. That way, if you ran into a problem, you could e.g. just create a new rule and re-import the problematic section. What do you think? Sounds like a worthy ‘Beijerdea’ to me.

Michael



Merab Dekano  Identity Verified
Spain
Member (2014)
English to Spanish
+ ...
pdf is a problem Oct 22, 2014

I tried to import and translate a pdf file with Trados Studio 2014. Technically, it does it. In reality, it is not worth it, as the source text gets badly altered in the process. Most of the formatting is gone. Some words/phrases are missing. Tags are not in the correct place. In short, not a very good idea. Actually, some agencies directly advise you not to translate converted pdf files.

Alternatives:

Option 1
Manually select all, then copy and paste the text into a Word file. Go through it and "fix" the formatting. Watch out: some words/letters are altered in the process, so you will have to actually read the entire document and fix the "mistakes". It is not a bad idea at all, as reading a document before translating it gives me valuable insights into the context and into what to expect ahead. It is relatively quick.

Option 2
Do not use a CAT tool. Translate from scratch, with the pdf as the source document on one screen and a Word file as the target on the other. This is quicker, but you will have none of the "advantages" of a CAT tool: TM, formatting, segmenting, etc.

Option 3
Transcribe into a Word file. This is the slowest way and if the file is large, I do not see it as a feasible option.

The bottom line is that importing a pdf file into ANY CAT tool is asking for trouble; pdf, by definition, is not for processing in any conceivable way other than reading, and hopefully enjoying, the text. pdf is a 'finished product'.



SDL Community  Identity Verified
United Kingdom
Local time: 00:23
English
Option 4... Oct 22, 2014

... try a product called InFix, which can do a remarkable job of extracting the text. You translate the text and then InFix can import it back in again. The PDF looks just like the original, only translated.

http://www.iceni.com/

Translating PDFs is always a bad idea and should be a last resort.

Regards

Paul



José Henrique Lamensdorf  Identity Verified
Brazil
Local time: 20:23
English to Portuguese
+ ...
On Infix... Oct 22, 2014

SDL Support wrote:

... try a product called InFix, which can do a remarkable job of extracting the text. You translate the text and then InFix can import it back in again. The PDF looks just like the original, only translated.

http://www.iceni.com/

Translating PDFs is always a bad idea and should be a last resort.

Regards

Paul


Infix handles software-generated PDFs for translation very well, since it lets you deal with all the DTP issues properly. Infix has a built-in OCR feature for scanned PDFs; however, IMHO it is relatively slow, and its output is often quite bad (compared to OmniPage).

Nobody deserves to be sentenced to do complex (or any) DTP work using MS Word. Human Rights advocates should have taken care of forbidding such cruelty against translators.


I developed a walk-through of the Infix process at http://www.lamensdorf.com.br/translating-a-pdf.html . This should give a clear idea of what it entails. Iceni tested the Infix workflow with Trados and DejàVu, while I use it with WordFast Classic, so the choice of CAT tool is not a problem; Infix should work fine with other CAT tools as well.

I don't think translating (distilled, not scanned) PDFs is a bad idea at all; I do it quite often. However, I'm very proficient with PageMaker (ergo InDesign); I've been using it for DTP since the days of icon-less Windows 2.01, which makes working with Infix quite easy for me. It's a matter of the translator deciding whether they want to go into DTP or not, because it is a different type of service.

Anyway, of course, I charge the usual per-word rate for translation PLUS a separate (usually per-page) rate for DTP adjustments on PDF files. No point in doing it for free (like I've heard that many desperate translators do) just because it's not strictly translation work.

