Searching words inside other words with TMLookup
Thread poster: Dominique Pivard

Dominique Pivard  Identity Verified
Local time: 00:06
Finnish to French
Mar 2, 2016

In Memsource and memoQ, you can do a concordance search (Ctrl+K) on a word located inside another word (useful in languages that make heavy use of compound words) by adding an asterisk before and after the word you are interested in. For instance, *tila* will find segments that contain urheilutilan.



I guess the same is possible with András Farkas’ TMLookup, using regex. As a regex novice, I’m therefore asking: what would be the equivalent regex syntax?



Dominique Pivard  Identity Verified
Local time: 00:06
Finnish to French
TOPIC STARTER
Highlight, Filter and -Filter buttons in TMLookup Mar 2, 2016

Oh, and another TMLookup question, while I’m at it: what are the Highlight, Filter and -Filter buttons for? How would you typically use them?

Any plans to add version 1.55 (currently available via the Dropbox link mentioned in this post) to its proper home page at FarkasTranslations.com (which currently hosts the older 1.0 and 1.31 versions)?


FarkasAndras
Local time: 23:06
English to Hungarian
*tila* Mar 3, 2016

Well, the normal search mode cannot do this.* The regex mode can. You don't need to enter anything special, just tila. You only need wildcards if you want to search for two strings with something in between. E.g. you want to find "special knowledge", "specialist knowledge", "specialized knowledge" and other variants. Then you'd enter "special.*knowledge". The full stop is the any-character wildcard, and the asterisk stands for "any number of occurrences" of whatever precedes it. So .* is largely the same as the * in memoQ.

Note that regex search is extremely slow compared to normal searches. It's fine for a db with a hundred thousand entries, but it's not really practical if you have several million entries in your db. There is a regex cheat sheet in the Help menu.
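To make the two regex cases concrete, here is a minimal sketch using Python's re module as a stand-in. TMLookup's own regex flavor and its case handling may differ, and the sample segments and the regex_search helper are made up for illustration.

```python
import re

# Made-up sample segments; a real TM would hold these in its database.
segments = [
    "Kaupunki rakentaa uuden urheilutilan.",
    "This requires special knowledge.",
    "She has specialist knowledge of tax law.",
    "Specialized knowledge is needed here.",
    "No match in this one.",
]

def regex_search(pattern, texts):
    """Return the texts in which the pattern matches anywhere (hypothetical helper)."""
    # Case-insensitive matching is an assumption, not documented TMLookup behavior.
    rx = re.compile(pattern, re.IGNORECASE)
    return [t for t in texts if rx.search(t)]

# A bare string matches anywhere, including inside a longer word.
in_word_hits = regex_search("tila", segments)

# ".*" bridges whatever sits between the two strings.
variant_hits = regex_search("special.*knowledge", segments)

print(in_word_hits)
print(variant_hits)
```

A regex has to be tried against every segment in turn, which fits the remark that regex mode gets impractical on databases with millions of entries.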

The new version will eventually be hosted on my site; it's just that updating the site is a bit of a pain.
Highlight/Filter/-Filter help you refine searches and find the relevant bits in your text. E.g. you do a search with a French search term and you know the English result you're looking for is a two-word term where you already know one word. Enter that word in the highlight box, click Highlight and all the hits will be highlighted in the list so they're easier to find in that wall of text. You also get stats at the top of the window in parentheses (Highlight gives you stats of the hits in the displayed hit list, which is capped at 500 hits). Filter hides all the hits that don't contain the filter term and -Filter hides all that do. You can often get the same result by using the main search boxes, but 1) you can't always do negative searches in the main search box and 2) Filter/-Filter usually executes much faster.
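As a rough illustration of what the three buttons do to a hit list that is already displayed, here is a sketch in Python. It is not TMLookup's code: the function names, the bracket-style highlighting and the case-sensitive substring matching are all assumptions made for the example.

```python
def highlight(hits, term):
    """Mark each occurrence of term and count the hits containing it."""
    marked = [h.replace(term, f"[{term}]") for h in hits]
    count = sum(1 for h in hits if term in h)
    return marked, count

def filter_hits(hits, term):
    """Keep only the hits that contain term (Filter)."""
    return [h for h in hits if term in h]

def negative_filter(hits, term):
    """Hide every hit that contains term (-Filter)."""
    return [h for h in hits if term not in h]

# Made-up hit list standing in for the displayed search results.
hits = ["translation memory", "memory card", "working memory", "storage space"]

marked, count = highlight(hits, "memory")
kept = filter_hits(hits, "memory")
rest = negative_filter(hits, "memory")

print(marked, count)
print(kept)
print(rest)
```

All three work on the short list already on screen rather than on the whole database, which is one plausible reason they run faster than a new search.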


* That's because TMLookup does its searches using a special text search technology called FTS (full-text search) in the SQLite database engine. FTS makes word searches on very large databases extremely fast, but it has its limitations: it can't do fuzzy searches and it can't do in-word searches. CAT tools use similar text search technologies in whatever database engine they are built on, and these usually support fuzzy and in-word searches, at the cost of slower speed and larger file sizes.
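The trade-off can be sketched with a toy word index in Python. This is a simplified illustration of the general idea behind word-based full-text indexes, not SQLite's actual FTS implementation; the segments, the tokenizer and the function names are invented for the example.

```python
import re
from collections import defaultdict

# Made-up segments keyed by id.
segments = {
    1: "Kaupunki rakentaa uuden urheilutilan",
    2: "Tila on varattu",
    3: "Uusi rakennus",
}

def tokenize(text):
    """Split text into lowercase whole words."""
    return re.findall(r"\w+", text.lower())

def build_index(segs):
    """Map each whole word to the ids of the segments containing it."""
    index = defaultdict(set)
    for sid, text in segs.items():
        for tok in tokenize(text):
            index[tok].add(sid)
    return index

index = build_index(segments)

def word_search(term):
    """Fast path: a single lookup, but it only sees whole words."""
    return sorted(index.get(term.lower(), set()))

def in_word_search(term):
    """Slow path: scan every segment for the substring."""
    return sorted(sid for sid, text in segments.items()
                  if term.lower() in text.lower())

print(word_search("tila"))
print(in_word_search("tila"))
```

The index lookup never finds urheilutilan because only whole words are stored as keys, while the full scan does find it but has to read every segment.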

