Technical question about SDL Studio "events"
Thread poster: Mpoma
Mpoma  Identity Verified
United Kingdom
Local time: 10:38
French to English
May 29, 2016

Dear all,

I am underwhelmed by SDL Studio (2014)'s facilities for concordance searching and termbase searching.

Over the years I have put together a French -> English table in an Access dbase (Access 2000!), currently with about 30,000 French "head word" entries (although the English definition side can sometimes be quite large).

About a year ago I decided to write a Java application which would "reverse index" all the words in this dbase table, including all the words in the "definition" side. By "reverse index" I mean using the very powerful Lucene technology, similar to the technology which lies behind Google searching and all that sort of stuff.

Lucene uses things like "stemming", so that (in English) "approve" and "approval" would probably both be stored as "approv"... It also uses very clever algorithms for "scoring" individual documents held in its index. It can be "forgiving" (you don't have to have a perfect match). Above all it is the technology you absolutely need to have if you are searching for multi-word terms.

In French, for example, if I enter "juge référé", it will list many entries with "juge" or "jugement" or "référence" ... but at the TOP of the ranking it will consistently list "juge des référés" ("urgent injunctions judge"), because Lucene's index tells it that that particular entry contains both sought terms (or at least "juge" and a French-stemmed version of "référés", e.g. "refer" maybe).
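As a rough illustration of the ranking behaviour described above, here is a minimal Python sketch of a reverse (inverted) index. It is not Lucene and does not use Lucene's analyzers or scoring: the four-letter prefix "stemmer", the `ToyIndex` class and the sample entries are all assumptions made up for this example. It only shows why an entry containing both sought terms ends up at the top.

```python
import math
import re
import unicodedata
from collections import defaultdict


def normalise(text):
    """Lowercase the text and strip accents (référés -> referes)."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def stems(text):
    """Crude stand-in for real stemming: keep the first 4 letters of each word."""
    return {word[:4] for word in re.findall(r"\w+", normalise(text))}


class ToyIndex:
    """Hypothetical toy inverted index: stem -> set of entry numbers."""

    def __init__(self, entries):
        self.entries = list(entries)
        self.postings = defaultdict(set)
        for doc_id, entry in enumerate(self.entries):
            for stem in stems(entry):
                self.postings[stem].add(doc_id)

    def search(self, query):
        """Return (entry, score) pairs, best first. Rarer stems weigh more."""
        total = len(self.entries)
        scores = defaultdict(float)
        for stem in stems(query):
            docs = self.postings.get(stem, set())
            if not docs:
                continue
            idf = math.log(1 + total / len(docs))
            for doc_id in docs:
                scores[doc_id] += idf
        ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
        return [(self.entries[doc_id], round(score, 3)) for doc_id, score in ranked]


index = ToyIndex(["juge des référés", "jugement", "référence", "juge d'instruction"])
for entry, score in index.search("juge référé"):
    print(entry, score)
```

All four entries are returned because each shares at least one stem with the query, but "juge des référés" scores highest since it is the only one matching both stems.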

What I want to do now is to make a comprehensive, automatic search using this Lucene index, every time I move to a new segment in SDL Studio, i.e. a search for all the new terms which appear in the source of the new segment. This can be guaranteed to produce far higher quality results than a dull-witted SDL concordance or TermBase search.

I'd just like to know whether anyone knows of a way of "trapping" SDL events: what I want is to have something "listening" for the "event" of moving to a new segment, and on detecting such an event it should take all the source text in this new segment and run a series of queries on the Lucene index ...
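The "series of queries" step could look something like the following Python sketch, which turns one source segment into candidate multi-word and single-word queries, longest first. The function name `candidate_queries`, the tiny stopword list and the three-word limit are assumptions chosen purely for illustration; a real setup would hand each query to the Lucene index.

```python
import re

# Hypothetical, deliberately tiny French stopword list for the example.
STOPWORDS = {"le", "la", "les", "de", "des", "du", "un", "une", "et", "en", "l", "d"}


def candidate_queries(segment, max_words=3):
    """Build word n-grams from a segment, longest first, skipping
    n-grams that start or end with a stopword."""
    words = re.findall(r"\w+", segment.lower())
    queries = []
    for size in range(max_words, 0, -1):
        for start in range(len(words) - size + 1):
            window = words[start:start + size]
            if window[0] in STOPWORDS or window[-1] in STOPWORDS:
                continue
            query = " ".join(window)
            if query not in queries:
                queries.append(query)
    return queries


print(candidate_queries("Le juge des référés"))
```

Stopwords are still allowed inside an n-gram, so a term such as "juge des référés" survives as a single query.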

PS Does anyone have a view on why SDL is so late to discover Lucene-style "reverse index" technology? It seems inexplicable, as it is a perfect match for the intense language search work which translators do all the time...

[Edited at 2016-05-29 16:12 GMT]



Ben Senior  Identity Verified
Germany
Local time: 11:38
German to English
APIs will do that May 29, 2016

If you want to trap events in Studio you need to download the SDK and the APIs. Using the APIs you will then be able to write a standalone app or a Studio plug-in to trap various events. You will need to be able to program in C# to do this, though.

Mpoma  Identity Verified
United Kingdom
Local time: 10:38
French to English
TOPIC STARTER
thanks May 29, 2016

thanks... hmmm, unfortunately I know nothing about C#... strictly Java and Python (and Jython).

Having said that... do you have any pointers about how to get started doing this kind of "plug-in" stuff with SDL Studio... an example of a simple one, maybe?

[Edited at 2016-05-29 19:19 GMT]


Mpoma  Identity Verified
United Kingdom
Local time: 10:38
French to English
TOPIC STARTER
AutoHotkey to the rescue May 30, 2016

Hmmm... rather than spending the next 6 months on learning about SDL plugins, this AutoHotkey script will do pretty much what I want!

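^J::              ; hotkey: Ctrl-J
Send ^{enter}     ; move to the next segment
Send !{insert}    ; put the source text into the target
Send ^a           ; select it all
Send ^c           ; copy it to the clipboard
Sleep, 50         ; give the clipboard a moment to update
Send ^1           ; insert the first match, if there is one
Send {left}
return            ; end of the hotkey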

Ctrl-J then moves you to the next segment, copying the source to the clipboard, and then inserting the first match if there is one.

With a bit of luck a Java app can then have a listener which listens for clipboard changes, and performs intelligent Lucene querying of the source text of the new segment, relative to any loaded TMs and external vocab sources...
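The clipboard-listening idea can be sketched as a simple polling loop. The post has a Java app in mind; the version below is in Python only to keep the illustration short. The `ClipboardWatcher` class and `tk_clipboard_reader` helper are hypothetical names, the clipboard is read through the standard-library `tkinter` module, and the callback is where the Lucene querying would go.

```python
import time


class ClipboardWatcher:
    """Calls on_change(text) whenever the clipboard text differs from the last one seen."""

    def __init__(self, read_clipboard, on_change):
        self.read_clipboard = read_clipboard  # any callable returning the current text
        self.on_change = on_change
        self.last_seen = None

    def poll_once(self):
        """Check the clipboard once; return True if on_change was fired."""
        text = self.read_clipboard()
        if text and text != self.last_seen:
            self.last_seen = text
            self.on_change(text)
            return True
        return False

    def run(self, interval=0.2, max_polls=None):
        """Poll repeatedly; max_polls=None means poll forever."""
        polls = 0
        while max_polls is None or polls < max_polls:
            self.poll_once()
            polls += 1
            time.sleep(interval)


def tk_clipboard_reader():
    """Return a reader function backed by tkinter (needs a desktop session)."""
    import tkinter
    root = tkinter.Tk()
    root.withdraw()  # no visible window

    def read():
        try:
            return root.clipboard_get()
        except tkinter.TclError:  # clipboard empty or not text
            return ""

    return read


if __name__ == "__main__":
    watcher = ClipboardWatcher(tk_clipboard_reader(), lambda text: print("New segment:", text))
    watcher.run()
```

One limitation of comparing against the last text seen: two consecutive segments with identical source text would trigger only one callback.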

Incidentally, another thing about SDL and its dull-wittedness: it has occurred to me that in a situation like this, if the translator is working on Segment S, you might actually expect that searching and sequence-identification and Lucene querying relative to Segment S+1 might be going on *in the background*, ... in anticipation of you moving to Segment S+1 after you've finished with Segment S!

The amount of analysis would then potentially be colossal: seconds of processing time.
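A minimal sketch of that look-ahead idea, assuming the full list of source segments is available to the program (which the clipboard approach alone would not give you). `SegmentPrefetcher` and the `analyse` callback are hypothetical names; `analyse` stands for whatever slow searching and querying is wanted per segment.

```python
from concurrent.futures import ThreadPoolExecutor


class SegmentPrefetcher:
    """Analyses segment S on demand and segment S+1 in the background."""

    def __init__(self, segments, analyse):
        self.segments = segments
        self.analyse = analyse
        self.pool = ThreadPoolExecutor(max_workers=1)  # one background worker
        self.pending = {}  # segment number -> Future

    def _schedule(self, number):
        """Queue analysis of a segment unless it is out of range or already queued."""
        if 0 <= number < len(self.segments) and number not in self.pending:
            self.pending[number] = self.pool.submit(self.analyse, self.segments[number])

    def enter(self, number):
        """Call when the translator moves to a segment (number assumed in range).
        Returns that segment's analysis and queues the next segment."""
        self._schedule(number)
        self._schedule(number + 1)
        return self.pending[number].result()

    def close(self):
        """Wait for queued work to finish and stop the worker."""
        self.pool.shutdown()


prefetcher = SegmentPrefetcher(["Le juge", "des référés"], lambda s: s.upper())
print(prefetcher.enter(0))
prefetcher.close()
```

By the time `enter` is called for the next segment, its analysis has usually finished already, so the result comes back without a wait.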

If I were a megacorporation rather than silly little me you might possibly think that I might have thought of this some time over the past 20 or more years!

