| MultiTerm Extract useless in my humble opinion || Jan 13, 2009 |
I find that, for what Extract is meant to do and the price you have to pay for it, it is pretty useless. I don't find it to be a good term extractor - it skips terms it shouldn't and it adds candidates that are not really terms. In short, once the extract operation is done, you still have to invest time sorting the results.
For one thing, you need to figure out if you work on the kind of contracts that can put a term extractor to good use. If you translate shortish documents or if you translate documents that are not terminology oriented (marketing copy, for example, is not well suited for term extraction, nor for CAT tools for that matter), then chances are the expensive piece of software will be sitting on your desktop gathering dust. Term extractors are better suited for long documents with repetitive terminology, where terminology consistency is important. A few good examples are software manuals, employee manuals, and pretty much anything called a manual or a user's guide. Scientific text is also generally a good candidate for term extraction.
I mostly translate documents that are both large and require consistent terminology. However, I realized after a while that term extraction software just doesn't cut it. The thing is, term extraction that really suits your needs cannot be achieved by automatic means alone - it always requires human intervention. I therefore prefer to use a concordancer to obtain a list of frequent strings, then work my way through that list manually to establish and translate the terms for a translation project. The extraction step itself takes longer than running term extraction software, but the lost time is quickly made up by the precision gained. With the method I have developed over the past two years, I know that once I have created and translated my list of terms and started translating, I will not have to go back to terminology tasks: pretty much everything I need is already in my termbase, without the clutter that term extraction software often introduces. I hardly ever need to take a break from translating to look up terms. The end result is that, on average, I spend about as much time with my manual method as I would have spent using software, only the output is better.
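To give an idea of what "a list of frequent strings" can look like, here is a rough Python sketch. It is only an illustration of the general idea, not how AntConc or any other concordancer works internally; the function name `frequent_strings`, the tiny stopword list and the thresholds are all made up for this example. It counts repeated sequences of one to three words and drops candidates that begin or end with a function word. The resulting list still has to be reviewed by hand, which is the whole point of the method.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from any tool).
STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "or", "is", "for", "on"}

def frequent_strings(text, max_words=3, min_count=2):
    """Return (string, count) pairs for word sequences that repeat in text."""
    # Naive tokenization: lowercase words, digits, apostrophes and hyphens.
    # Sentence boundaries are ignored, so some candidates may span two sentences.
    words = re.findall(r"[a-z0-9'-]+", text.lower())
    counts = Counter()
    for n in range(1, max_words + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            # Skip candidates that start or end with a function word.
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            counts[" ".join(gram)] += 1
    repeated = [(s, c) for s, c in counts.items() if c >= min_count]
    # Most frequent first; among ties, longer strings first, then alphabetical.
    return sorted(repeated, key=lambda p: (-p[1], -len(p[0].split()), p[0]))

if __name__ == "__main__":
    sample = ("Open the print dialog. The print dialog shows the paper tray. "
              "Select the paper tray.")
    for candidate, count in frequent_strings(sample):
        print(count, candidate)
```

On a real project you would feed it the full source text and raise `min_count`, then go through the candidates yourself to decide which ones are actual terms.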
If you would like to try a concordancer, I suggest AntConc. There are many others out there, but I find that AntConc has many adjustable features that let me adapt it to my term extraction needs. Keep in mind that the main purpose of a concordancer isn't term extraction, so many concordancers are useless for this purpose.
As for Wordfast, I haven't tried it, but if memory serves, it does have a term extractor built in. I guess you would have to try the evaluation version to see if it is worth your while.
Just remember - no software will ever do your job in your stead. This is especially true of term extraction. You are often better off finding a manual method assisted by software than relying on a fully automatic one.