The instructions on the OmegaT site for installing tokenizers say you should select the appropriate tokenizer from the list. For my source language, Russian, two are listed: the SnowballRussianTokenizer and the LuceneRussianTokenizer. What is the difference, and which one is better to use? Or does each have its own advantages?
Susan Welsh, United States, Member (2008), Russian to English + ...
I use lucene
Nov 3, 2011
My recollection of past discussions is that the Lucene tokenizer has a "stop word" function that the Snowball one does not (meaning it ignores small, irrelevant words like "and" and "the" when matching segments). Someone will probably correct me if I'm wrong. You can try them both and see which you like.
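To make the "stop word" idea concrete, here is a toy sketch in Python. It is not OmegaT's or Lucene's actual code: the stop-word list, the function names, and the simple token-overlap score are all assumptions made up for illustration, and OmegaT's real fuzzy matching works differently. It only shows why dropping words like "and" and "the" can make two segments look more alike.

```python
# Toy illustration only; not how OmegaT or Lucene is implemented.
# STOP_WORDS is a tiny hypothetical list chosen for this example.
STOP_WORDS = {"and", "the", "a", "of"}


def tokens(segment, drop_stop_words):
    """Split a segment into lowercase words, optionally dropping stop words."""
    words = segment.lower().split()
    if drop_stop_words:
        words = [w for w in words if w not in STOP_WORDS]
    return words


def overlap(a, b, drop_stop_words):
    """Share of distinct words the two segments have in common (Jaccard index)."""
    ta = set(tokens(a, drop_stop_words))
    tb = set(tokens(b, drop_stop_words))
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


seg1 = "the contract and the appendix"
seg2 = "contract and appendix"

print(overlap(seg1, seg2, drop_stop_words=False))  # stop words count as differences
print(overlap(seg1, seg2, drop_stop_words=True))   # stop words ignored
```

With stop words ignored, the two segments score as identical in this toy measure; with them counted, the extra "the" lowers the score.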
I translate from Russian, and the Lucene tokenizer works great for me.
Susan