Terminology extraction software
Thread poster: Eleni Makantani

Eleni Makantani
Greece
Local time: 21:57
Member
English to Greek
Feb 6, 2008

Dear all,

I would like to find out which terminology extraction software you use. I don't use any myself, but I am seriously considering getting one.

Do you think that investing in terminology extraction software is a good idea? Which tool do you prefer? Are there any free solutions on the Internet?

If you don't use one, how do you deal with terminology extraction? My main client (a translation agency) wants to get a separate file with terminology with every project, and I am tired of doing all this manually (especially when I have to do it on a big project).

Many thanks!



John Di Rico  Identity Verified
France
Local time: 20:57
Member (2006)
French to English
PlusTools Feb 6, 2008

Dear Eleni,
PlusTools (www.wordfast.net) has a terminology extraction tool. I haven't used it very much... it's free so have fun and let me know how it works!

Good luck,

John



ViktoriaG  Identity Verified
Canada
Local time: 14:57
English to French
Automatic terminology extraction does not exist Feb 6, 2008

The process of terminology extraction, if you want the result to be reasonably usable, relies heavily on human intervention. Many attempts have been made at term extraction, and some resulted in quite interesting solutions. I haven't tried any of the commercial products (SDL PhraseFinder, SDL MultiTerm Extract, etc.), but of the freely available products, the one that came closest to producing a usable termbase was across (freelancers get it free in exchange for registering in the software publisher's database). It is far from perfect, but I find it much better than PlusTools. I am quite disappointed with PlusTools - it is so basic that it proposes mostly strings that are not even close to being terms. I can't blame them - a computer is totally devoid of intelligence, and it takes a lot of intelligence to identify terms.

The solution I have found for extracting terms at a reasonable speed, and which at the same time really helps make sure you don't skip any terms, is AntConc, a (free) concordancer. A concordancer compiles statistics on a text and tells you how many times a word or a string of words occurs in a document. You can define how you want the results displayed, how many words there can be in a "term", etc. This gives you a list of words and word strings that are almost all term candidates. This is where you come in - you now need to compile a list of terms based on the results, and then you can translate them and turn them into a TDB, a CSV or any other type of file that suits your needs. It sounds complicated - but once you learn to use AntConc, which in my case took about an hour, you'll see that it is actually very simple and logical.
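For readers who want to see what this kind of counting amounts to, here is a minimal Python sketch. It is not AntConc's implementation, only an illustration of counting word strings within a chosen length range; the function name `ngram_counts`, its parameters and the sample text are all made up for the example.

```python
import re
from collections import Counter


def ngram_counts(text, min_len=1, max_len=3):
    """Count every run of min_len..max_len consecutive words in text."""
    # Lowercase the text and keep only word-like tokens. Punctuation is
    # dropped, so word strings may span sentence boundaries in this sketch.
    words = re.findall(r"[^\W_]+(?:'[^\W_]+)?", text.lower())
    counts = Counter()
    for n in range(min_len, max_len + 1):
        for i in range(len(words) - n + 1):
            counts[" ".join(words[i:i + n])] += 1
    return counts


# Hypothetical sample text, for illustration only.
sample = (
    "The hydraulic pump feeds the hydraulic circuit. "
    "Check the hydraulic pump before starting the hydraulic circuit."
)
counts = ngram_counts(sample, min_len=2, max_len=2)
for phrase, freq in counts.most_common(5):
    print(freq, phrase)
```

On this sample the most frequent two-word string is "the hydraulic", not a term, which is exactly why the resulting list still needs a human to pick out the real candidates.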

All the best!



Alexey Ivanov  Identity Verified
Russian Federation
Local time: 21:57
English to Russian
Probably waste of money Feb 6, 2008

I have 2 terminology extraction tools (3 counting the extraction tool of the machine translation application "Promt"): TermExtract for Trados MultiTerm and the "Create lexicon" function, which is part of Déjà Vu.
The problem with them all is that they are rather mechanical: you choose the minimum and the maximum number of words a term may consist of, and they start extracting. They extract all combinations of words within that range: first 1-word terms, then 2-word terms, and so on. And they will include even the articles. So you have to edit the results, killing all the nonsensical "terms" made up of combinations that make no sense at all. That takes a lot of time, and I have come to the conclusion that it is much easier to create a term base while translating, using MultiTerm when translating in Trados or SDLX and "Send to Lexicon" in Déjà Vu.

The most convenient of them all is the extraction tool of Promt. It shows the context, so it is easier to decide which terms can be useful, and killing the nonsensical combinations is very easy. The problem with it, however, is that you can save the results only in MultiTerm 5, which is actually obsolete now, and the process of conversion into MultiTerm 7 is very complicated. But I would not advise you to buy this application - it is useless for a professional translator. So think twice before you buy MultiTerm Extract, which is very expensive.
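To illustrate what "showing the context" means for a term candidate, here is a small keyword-in-context sketch in Python. It says nothing about how Promt does it internally; the function name `kwic`, the `width` parameter and the sample text are assumptions made for the example.

```python
import re


def kwic(text, term, width=30):
    """Return each occurrence of term with up to `width` characters of context on each side."""
    lines = []
    # \b stops "pump" from matching inside "pumps"; re.escape treats the
    # term as literal text. Matching is case-insensitive.
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-align the left context so the hits line up in a column.
        lines.append(f"{left:>{width}}[{m.group(0)}]{right}")
    return lines


# Hypothetical sample text, for illustration only.
text = "Check the hydraulic pump. The pumps are old. Pump oil weekly."
for line in kwic(text, "pump"):
    print(line)
```

Seeing each candidate in its surroundings like this makes it quicker to decide whether it is a real term or just a frequent word combination.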


xxxLia Fail  Identity Verified
Spain
Local time: 20:57
Spanish to English
AntConc Feb 6, 2008

Viktoria Gimbe wrote:
........

AntConc ... a (free) concordancer.

All the best!


Easy to use too:-)

I have also used AntConc to create word lists, although not on a major scale, so I won't comment other than to say that I read Viktoria's contribution with interest.

If you are in Split in Croatia in September, we're running a workshop on corpus use that features AntConc:-) as part of METM08:
http://www.metmeetings.org/



ViktoriaG  Identity Verified
Canada
Local time: 14:57
English to French
The bigger the better Feb 6, 2008

Lia Fail wrote:

I too have used AntConc to create word lists too, although not on a major scale

Actually, AntConc gets better the larger your text is. The bigger your text sample, the better you can distinguish between term candidates and non-terms, because usually, the words and word strings that have the highest repeat rate are terms and not stopwords and other expressions. So, when you get your first results list in AntConc, if your text is fairly large, the best term candidates are always at the top of the list, and the worst term candidates are always at the bottom.

It will sound crazy, but I am reassured that I am not the only one using a concordancer to extract terminology - sometimes, when I figure out ways to work better or faster, I wonder if I am not doing things the weirdest way.


Tatty  Identity Verified
Local time: 20:57
Spanish to English
This weekend is the weekend Feb 7, 2008

I have several problems with my Trados which I have promised myself I will sort out this weekend. Then I intend to try out the extraction tools I bought 2 years ago and have never used. I bought MultiTerm Extract and PhraseFinder as a deal and they cost me about €700, reduced from €1000, so I am hoping that they are going to be of some use to me!

BTW, I hate it when agencies ask you to provide a glossary afterwards; I think that is their job. I never give them an exhaustive list, 10 or 15 words tops.



Anne Seerup
Ireland
Local time: 19:57
English to Danish
Tedious work Feb 7, 2008

I bought Trados MultiTerm Extract, and to be honest I very rarely use it.
The process is so tedious because you have to go through this LONG list of terms and verify and delete. This takes hours to do, so it would be easier to just look up the words once you encounter them in your text and enter them in your termbase.
It is just that when I am not busy working, I prefer getting some exercise or doing other, more fun things than verifying long lists of terminology, so I never really get around to it.

The one occasion when it did prove useful was when I was working frequently for a client with very specific terminology.



Vladimir Shelukhin  Identity Verified
Local time: 21:57
English to Russian
You've said it all Feb 7, 2008

Alexey Ivanov wrote:
…I have come to the conclusion that it is much easier to create a term base while translating using MultiTerm when translating in Trados or SDLX and "Send to Lexicon" in Déjà Vu.
The only sound solution today. And a very rewarding one.



Anna Sylvia Villegas Carvallo
Mexico
Local time: 13:57
English to Spanish
Dear Eleni, Feb 7, 2008

This is just my opinion, and I fully respect my colleagues' recommendations, but my concept of "term extracting" or "terminology extraction" is to save or keep only the terms that could cause trouble with my customer, or that are needed for consistency across my own translations or among fellow translators when working as a team.

For this, I don't need a long list of unrelated words, just the technical terms related to my current translation or specializations.

I've been trying to learn how to use MultiTerm Extract version 7.0.2 for three or more days and haven't had any success! Today I got a new job and happily went back to MultiTerm 5 which, though old and paleolithic, is a fine application for the purpose. It works smoothly with all Trados applications, and works on its own for any kind of reference!

I'll stay with MultiTerm 5.

Tatty wrote:
BTW, I hate it when agencies ask you to provide a glossary afterwards, I think that is their job. I never give them an exhaustive list, 10 or 15 words tops.

Yeahhh... Me too.


xxxhazmatgerman
Local time: 20:57
English to German
automatic vs manual solution Feb 13, 2008

Dear asker, dear Ms. Gimbe:
Without investing heavily in software, I have found the following to be a practicable way of doing my terminology work. I display the bitexts side by side in PDF format in the Reader to locate terms. The refined search in either language's PDF yields a window which comes close to that of a monolingual concordancer. I can then scroll to the corresponding expression (rather than term) in the other language and assess the finds. This happens on one monitor.
The second monitor shows the terminology database (Access, dBase, Excel, StarWriter, ...) and I check or enter the solutions there. Apart from the database software, all tools are open source or free.

If a more automated way exists, I'd be most pleased to learn of it. However, automatic procedures tend to blind (me, for one) to the non-conforming isolated cases that are the really intriguing ones. Ms. Gimbe's "automatic extraction does not exist" statement probably aptly summarises the situation.
Regards


David Turner  Identity Verified
Local time: 20:57
French to English
Tim Craven's ExtPhr32... Feb 13, 2008

Eleni Makantani wrote:

Are there any free solutions on the Internet?



... is an excellent little freeware utility which "Extracts every word and every phrase up to a certain length that occurs at least a minimum number of times in a source text and that does not start or end with a stopword"

An English stopword list file is provided, but you can find stopword lists for other languages on the Web.

http://publish.uwo.ca/~craven/freeware.htm
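
The rule quoted above (phrases up to a certain length, a minimum number of occurrences, no stopword at either end) can be sketched in a few lines of Python. This is only an illustration of that description, not ExtPhr32's actual code; the name `extract_phrases`, its parameters, the tiny stopword list and the sample text are all assumptions.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real one would be much longer.
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "before"}


def extract_phrases(text, max_len=3, min_count=2, stopwords=STOPWORDS):
    """Count words and phrases of up to max_len words that occur at least
    min_count times and neither start nor end with a stopword."""
    words = re.findall(r"[^\W_]+", text.lower())
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            gram = words[i:i + n]
            # Skip candidates with a stopword at either edge; stopwords
            # in the middle of a phrase are still allowed.
            if gram[0] in stopwords or gram[-1] in stopwords:
                continue
            counts[" ".join(gram)] += 1
    return {phrase: c for phrase, c in counts.items() if c >= min_count}


# Hypothetical sample text, for illustration only.
sample = (
    "The hydraulic pump feeds the hydraulic circuit. "
    "Check the hydraulic pump before starting the hydraulic circuit."
)
for phrase, freq in sorted(extract_phrases(sample).items(), key=lambda kv: -kv[1]):
    print(freq, phrase)
```

Compared with a plain frequency count, the stopword check at the edges removes strings such as "the hydraulic" before they ever reach the list you have to review.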

David Turner


X Zhang
Local time: 02:57
English to Chinese
one possible solution Feb 22, 2008

Working with the assistance of a highly efficient and understanding secretary could be one possible solution, though it undoubtedly raises costs. A husband-and-wife team is therefore ideal.


Ivan Ivanov  Identity Verified
Local time: 21:57
German to Bulgarian
Termextract Feb 28, 2008

Many thanks to David.
This software works excellently.



Simon Sobrero  Identity Verified
United Kingdom
Local time: 19:57
Italian to English
SDL PhraseFinder works best but not for serious TMs May 20, 2008

See my postings e.g. http://www.proz.com/forum/smart_shoppers/72526-term_extraction_software_to_recommend.html

I would love to hear about anyone's experience with this product. It works very well up to a point, but try feeding in your master Translation Memory (which it claims it can handle) and be prepared for a big long crash. So what's the explanation? They don't know, or don't seem to care. Bizarre.

