A free term extraction tool?
Thread poster: eileenPL

eileenPL
Local time: 06:31
Polish to English
Jan 28

Hi all,

Could anyone recommend a free tool for extracting terms and phrases that can be used offline?

Regards,
E.



[Edited at 2018-01-28 21:21 GMT]



Jason_P
South Korea
Local time: 13:01
English to Korean
It's not for online use, though Jan 29

https://www.proz.com/forum/sdl_trados_support/322396-word_cloud.html


Danesh
Local time: 09:01
English to Farsi (Persian)
Okapi Rainbow Jan 29

Rainbow is a GUI application for launching various utilities related to translation and localization tasks, such as text extraction (to XLIFF, OmegaT projects, RTF, etc.) and merging, pre-translation, encoding conversion, term extraction, file format conversion, quality verification, translation comparison, search and replace on filtered text, pseudo-translation, and much more. Using the framework's pipeline mechanism, you can use Rainbow to create chains of steps that perform a custom set of tasks specific to your needs.
http://okapiframework.org/wiki/index.php?title=Rainbow



Samuel Murray  Identity Verified
Netherlands
Local time: 06:31
Member (2006)
English to Afrikaans
@Eileen Jan 29

eileenPL wrote:
Could anyone recommend a free extraction tool for terms and phrases...?


From what data sources will you be extracting?



Rossana Triaca  Identity Verified
Uruguay
Local time: 01:31
Member (2002)
English to Spanish
In addition to Okapi... Jan 29

you can also use the old free version of Xbench.

But the tool I like the most is AntConc. It's not exactly a term extraction tool, but the word list and, particularly, the "keyness" lists are great for this purpose, and the video tutorials are easy to follow for newcomers to corpus analysis.

For specialized texts, I find that having a baseline like the Brown corpus is invaluable!
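
As a rough illustration of what a keyness list measures, here is a minimal Python sketch that ranks the words of a text by log-likelihood against a reference text. It is not AntConc's code: the function names, the simple tokenizer, and the choice of the two-term log-likelihood formula are assumptions made for illustration, and in practice the reference would be a large corpus such as Brown rather than a short string.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Very simple tokenizer (assumption): lowercase ASCII words, optional apostrophe part.
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def keyness(target_text, reference_text, min_freq=2):
    """Rank words that are over-represented in target_text relative to reference_text."""
    target = Counter(tokenize(target_text))
    reference = Counter(tokenize(reference_text))
    n_t = sum(target.values())
    n_r = sum(reference.values())
    if n_t == 0 or n_r == 0:
        raise ValueError("both texts must contain at least one word")

    scores = {}
    for word, a in target.items():
        if a < min_freq:
            continue
        b = reference.get(word, 0)
        # Expected frequencies if the word were equally common in both texts.
        e1 = n_t * (a + b) / (n_t + n_r)
        e2 = n_r * (a + b) / (n_t + n_r)
        ll = 2 * (a * math.log(a / e1) + (b * math.log(b / e2) if b else 0.0))
        # Keep only words that are relatively more frequent in the target text.
        if a / n_t > b / n_r:
            scores[word] = ll
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    target = "the lessee shall pay the lessor. the lessee shall notify the lessor."
    reference = "the cat sat on the mat. the dog shall sleep on the mat. the bird sang."
    for word, score in keyness(target, reference):
        print(f"{word}\t{score:.2f}")
```

Words near the top of the list are those that are unusually frequent in the specialized text compared with the baseline, which is why they make good term candidates.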



eileenPL
Local time: 06:31
Polish to English
TOPIC STARTER
.doc files Jan 29

Samuel Murray wrote:

eileenPL wrote:
Could anyone recommend a free extraction tool for terms and phrases...?


From what data sources will you be extracting?


I'd like to extract phrases from .doc files. I'm a Trados user, but when I think of manually copying terms from the files into the translation memory... it'd take ages!



Thanks for all the replies!

[Edited at 2018-01-29 20:07 GMT]



Jason_P
South Korea
Local time: 13:01
English to Korean
Try mine. Jan 30

eileenPL wrote:
I'd like to extract phrases from .doc files. I'm a Trados user, but ...

It works well with SDL Trados Studio project files, which means you can use it for all the file formats that SDL Trados Studio accepts.

Regards



David Turner  Identity Verified
Local time: 06:31
French to English
You could try my PhraseMiner tool: Jan 30

eileenPL wrote:

Could anyone recommend a free extraction tool for terms and phrases...?


http://asap-traduction.com/PhraseMiner

From a Word document, it extracts "internal" or "intra-document" fuzzy matches, sentences containing 5 or more common consecutive words, sentences that are subsegments of longer sentences, terms of two or more words that appear two or more times (filtered using stop-word lists), sentences containing two or more of such extracted terms, etc.
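
For readers curious how this kind of multi-word term extraction can work in principle, here is a minimal Python sketch. It is not PhraseMiner's actual implementation: the function name `extract_terms`, the tiny stop-word list, and the rule that a candidate may not start or end with a stop word are all assumptions made for illustration, and it works on plain text rather than on a Word document.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption); real lists are much longer.
STOP_WORDS = {
    "a", "an", "the", "of", "in", "on", "to", "and", "or", "is", "are",
    "be", "for", "with", "by", "that", "this", "it", "as", "at",
}


def extract_terms(text, min_words=2, max_words=4, min_count=2, stop_words=STOP_WORDS):
    """Return {term: count} for multi-word candidates repeated at least min_count times."""
    counts = Counter()
    # Split into rough sentences so that candidates never cross a sentence boundary.
    for sentence in re.split(r"[.!?;:\n]+", text.lower()):
        # Unicode letters only, allowing internal hyphens and apostrophes.
        words = re.findall(r"[^\W\d_]+(?:[-'][^\W\d_]+)*", sentence)
        for n in range(min_words, max_words + 1):
            for i in range(len(words) - n + 1):
                gram = words[i:i + n]
                # Skip candidates that begin or end with a stop word.
                if gram[0] in stop_words or gram[-1] in stop_words:
                    continue
                counts[" ".join(gram)] += 1
    return {term: c for term, c in counts.items() if c >= min_count}


if __name__ == "__main__":
    sample = (
        "The translation memory stores segments. "
        "Update the translation memory often. "
        "A term base complements the translation memory."
    )
    print(extract_terms(sample))
```

Raising `min_count` or extending the stop-word list trims noise, at the cost of missing rarer terms.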

David Turner



Arianne Farah  Identity Verified
Canada
Local time: 00:31
Member (2008)
English to French
There's a new free app in the SDL store Jan 30

It works quite well. It takes a little tweaking at first to remove common words, but you can save those custom exclusion dictionaries and reuse them, so you'll only have to exclude 'that', 'which', 'will', 'allow', etc. once. You set your minimum word length and the minimum number of times a word must be repeated, and it generates a cloud of words. You can then whittle it down before creating an sdlxliff from it that you'll use to populate a glossary. It sounds like a lot of steps, but it's quite painless.


http://appstore.sdl.com/language/app/projecttermextract/817/
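
The workflow described in this post (minimum word length, minimum repetitions, a reusable exclusion list) can be approximated outside Studio with a few lines of Python. This is only a sketch under assumptions, not how the SDL app works internally: the function names and the one-word-per-line exclusion file format are hypothetical.

```python
import re
from collections import Counter
from pathlib import Path


def load_exclusions(path):
    """Read a saved exclusion list (one word per line); empty set if the file is missing."""
    p = Path(path)
    if not p.exists():
        return set()
    lines = p.read_text(encoding="utf-8").splitlines()
    return {line.strip().lower() for line in lines if line.strip()}


def save_exclusions(path, words):
    """Save the exclusion list so it can be reused for the next project."""
    Path(path).write_text("\n".join(sorted(words)) + "\n", encoding="utf-8")


def candidate_words(text, exclusions, min_length=4, min_count=2):
    """Return (word, count) pairs, most frequent first, after applying the filters."""
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(
        w for w in words if len(w) >= min_length and w not in exclusions
    )
    return [(w, c) for w, c in counts.most_common() if c >= min_count]


if __name__ == "__main__":
    exclusions = {"that", "which", "will", "allow"}
    text = (
        "The valve will allow flow. The valve will close. "
        "That valve seals which flange? The flange will allow it."
    )
    print(candidate_words(text, exclusions))
```

Like the app as described later in the thread, this sketch only looks at single words.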



Jason_P
South Korea
Local time: 13:01
English to Korean
Functionality limited. Jan 31

Arianne Farah wrote:

It works quite well - takes a little tweaking at first to remove common words, but you can ...
http://appstore.sdl.com/language/app/projecttermextract/817/


It cannot extract multi-word terms (two- or three-word phrases).
It only sees single words.

[Edited at 2018-01-31 10:18 GMT]

