Lift technology - is it on its way?
Thread poster: Wojciech_

Wojciech_
Poland
Local time: 04:45
English to Polish
+ ...
Aug 21, 2015

A question to SDL specialists here. I've recently come across a website where a unique technology for finding subsegment matches is described. It's called Lift and from what I understood with its help a translator will be able to retrieve numerous subsegment matches from their TM together with their translation (MemoQ offers something similar called "Guess translation" when using Concordance, but it's very inaccurate).

I have learnt that SDL has acquired the technology and my question is - will it be available soon in the next incarnation of Studio? I have seen a video where the technology is already implemented into one of the versions of Studio and the shown results were truly impressive.

Thank you.



SDL Community  Identity Verified
United Kingdom
Local time: 04:45
English
It has been... Aug 21, 2015

pro-lingua wrote:

A question to SDL specialists here. I've recently come across a website where a unique technology for finding subsegment matches is described. It's called Lift and from what I understood with its help a translator will be able to retrieve numerous subsegment matches from their TM together with their translation (MemoQ offers something similar called "Guess translation" when using Concordance, but it's very inaccurate).

I have learnt that SDL has acquired the technology and my question is - will it be available soon in the next incarnation of Studio? I have seen a video where the technology is already implemented into one of the versions of Studio and the shown results were truly impressive.

Thank you.


... in Studio 2015. It is very cool and returns results from TM lookups (100% and Fuzzy) as well as concordance chunks. So when this is all combined with standard AutoSuggest Dictionaries, and even Machine Translation AutoSuggest and Regex AutoSuggest you have a very impressive resource at your fingertips.

It also means you can start with an empty TM and get results immediately, based on whatever is in your TM as you work, and this is something many users have wanted for a long time. The AutoSuggest Dictionaries are very focussed and still provide excellent suggestions, but this does add to the overall solution.

Regards

Paul
SDL Community Support



Roy Oestensen  Identity Verified
Norway
Local time: 04:45
Member (2010)
English to Norwegian (Bokmal)
+ ...
Same as Deep Mining in Dejavu? Aug 21, 2015

I get the impression that what you describe is something similar to what Dejavu calls Deep mining, where Dejavu tries to find the best translation from the context in the TM. It sounds very good, but apparently it doesn't quite deliver what you would expect.

I am not convinced by a good presentation, which may show a special circumstance where it works well. Regrettably, it doesn't necessarily follow that it will do so in general usage. Instead, it often gives worse results than having it turned off.

So I would rather wait and see.



Patrick Porter
United States
Local time: 22:45
Spanish to English
+ ...
already possible...sort-of Aug 21, 2015

pro-lingua wrote:
...from what I understood with its help a translator will be able to retrieve numerous subsegment matches from their TM together with their translation...


In general terms, this is basically what statistical machine translation accomplishes. You could get this kind of result by using your TMs to train a machine translation model. In fact, I do this all the time and it ends up being a really great resource to have while I'm working. The result is like an automatic recursive concordance search. Of course, there are many different ways you could implement that, with varying results, but there are some well-developed and mature SMT tools that make it easy.

I use the Moses SMT toolkit, which is relatively simple to use (once implemented) and has an easy-to-follow user manual if you are familiar with any Linux OS. There is also an open-source project called "Casmacat Home" which has a browser-based user interface to make uploading TMs and using them to train models fairly easy. The advantage to these is my MT models reside on my machine at all times and I don't even need internet access to use them.
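
For anyone wondering what "using your TMs to train a machine translation model" involves in practice: the first step is usually to turn the TM into a sentence-aligned parallel corpus, i.e. two plain-text files where line N of one file is the translation of line N of the other. Below is a minimal Python sketch of that step only, assuming the TM has been exported as TMX. The function names (tmx_to_pairs, write_parallel) are made up for illustration and are not part of Moses or of any CAT tool; tokenisation, cleaning and the training run itself are left to the toolkit's own manual.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def pick(segs, code):
    """Return the text whose language is `code` or a regional variant of it."""
    code = code.lower()
    for lang, text in segs.items():
        if lang == code or lang.startswith(code + "-"):
            return text
    return None


def tmx_to_pairs(tmx_text, src_lang, tgt_lang):
    """Extract (source, target) pairs from TMX text (hypothetical helper)."""
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.findall("tuv"):
            # TMX 1.4 uses xml:lang; some older exports use a plain lang attribute
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() keeps all text inside <seg>, including any text
                # nested in inline tag elements; whitespace is normalised
                segs[lang] = " ".join("".join(seg.itertext()).split())
        src, tgt = pick(segs, src_lang), pick(segs, tgt_lang)
        if src and tgt:  # skip units that lack either language
            pairs.append((src, tgt))
    return pairs


def write_parallel(pairs, src_path, tgt_path):
    """Write one segment per line; line N of each file forms an aligned pair."""
    with open(src_path, "w", encoding="utf-8") as fs, \
         open(tgt_path, "w", encoding="utf-8") as ft:
        for src, tgt in pairs:
            fs.write(src + "\n")
            ft.write(tgt + "\n")
```

Usage would be along the lines of reading the TMX export into a string, calling tmx_to_pairs(text, "en", "pl"), and passing the result to write_parallel with two output file names of your choice.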

As another alternative, it seems that Microsoft now has a portal for training your own private MT engines on their servers and then accessing via the Translator API. This may be simpler to use for some people.

I would recommend trying one of these methods. Even with relatively small TMs (well...small from the MT corpus point of view...like 50,000 segments), my results have been very effective as a resource. I mean...if you are realistic about the quality of the output, i.e. just looking for a way to help you automatically look up previously translated shorter subsegments, and not really looking to make your own full-fledged general MT engine, then an SMT tool works really well.

There is even a way to set up Moses so that you can update the models with every segment you translate, so it sort of learns as you work, without having to re-train the model every time you have new data. Right now I'm using this method in Trados Studio with a prototype plugin I've developed. My plan is to release this plugin as an open-source project in the next few weeks, so if you are interested, watch my GitHub profile for an update on this. I'm also thinking of making some videos/tutorials about this topic.

[Edited at 2015-08-21 12:48 GMT]



Michael Joseph Wdowiak Beijer  Identity Verified
United Kingdom
Local time: 03:45
Member (2009)
Dutch to English
+ ...
Agree with Roy Aug 21, 2015

I have to see it to believe it, although it does sound interesting, and I have heard good things about this "Lift" thing.

Roy mentioned that it sounds like the Deep Miner in Déjà Vu. CafeTran has its own version, simply called "auto-assembly", which a few users love, and most of us leave switched off. It really depends on how carefully you prepare the resources it feeds off of, and of course, the type of work you do (preferably highly standardised, repetitive, consisting of small chunks, etc).

Michael



Meta Arkadia
Local time: 10:45
English to Indonesian
+ ...
Hits Aug 21, 2015

Michael Beijer wrote:
Roy mentioned that it sounds like the Deep Miner in Déjà Vu. CafeTran has its own version, simply called "auto-assembly"

Nope. It used to be called "Subsegment Matching" and is now "Hits." I have no idea how many CafeTran users use AA and/or Hits. I use AA extensively, but usually disable Hits, because the latter costs too much time and triggers too many false positives.

Hans



Michael Joseph Wdowiak Beijer  Identity Verified
United Kingdom
Local time: 03:45
Member (2009)
Dutch to English
+ ...
@Hans: Aug 21, 2015

Meta Arkadia wrote:

Michael Beijer wrote:
Roy mentioned that it sounds like the Deep Miner in Déjà Vu. CafeTran has its own version, simply called "auto-assembly"

Nope. It used to be called "Subsegment Matching" and is now "Hits." I have no idea how many CafeTran users use AA and/or Hits. I use AA extensively, but usually disable Hits, because the latter costs too much time and triggers too many false positives.

Hans


Not sure if calling it merely "Hits" does it justice, but I see what you mean:

Both DVX and CT can auto-assemble stuff.

(1) In DVX, you can switch on "DeepMiner" to assist this process.

(2) The CafeTran counterpart is enabling "Fuzzy & Hits" in the "Matching type" drop-down menu in the TM settings.

Both DeepMiner and "Hits" involve the CAT tool trying to guess stuff.



Meta Arkadia
Local time: 10:45
English to Indonesian
+ ...
Explanation Aug 21, 2015

I'll try to explain the process(es), not to try to be the great educator, but more to see if I understand it myself. Which is why it's very simplistic, out of necessity. So please correct me if I'm wrong.

A CAT tool will first look if there are exact matches for the segment. If there aren't, some tools will try to find close matches, supplemented by matches in the termbase(s). This is still on segment level, and is usually called Auto-Assembly, either inserted, or not. If there are still missing parts, it will look for them in other segments of the TMs, thereby "leaving" the segment. Which is why it can take a lot of time, especially in the case of large and/or multiple TMs. This is called "Dynamic TM Analysis," or Deep Miner in DejaVu and Subsegment Matching (now Hits) in CafeTran. I like the term "Lift" because the process "lifts" a part of another segment, and drops it in the current one.
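
The cascade described above (exact match, then close match, then a search inside other segments) can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the TM is a plain list of (source, target) pairs, similarity is the SequenceMatcher ratio from Python's difflib computed over words (real CAT tools use their own scoring), and the names lookup, fuzzy_threshold and min_words are invented for the example. It stops at finding the TM segments that share a phrase with the new segment; it does not try to pick out the matching part of the target.

```python
import re
from difflib import SequenceMatcher


def tokens(text):
    """Lower-cased words with punctuation stripped."""
    return re.findall(r"\w+", text.lower())


def similarity(a, b):
    """Word-level similarity between 0.0 and 1.0 (a stand-in for a fuzzy score)."""
    return SequenceMatcher(None, tokens(a), tokens(b)).ratio()


def ngrams(words, n):
    """All runs of n consecutive words, as tuples."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def lookup(segment, tm, fuzzy_threshold=0.7, min_words=2):
    """Return (kind, results) for a new segment; tm is a list of (source, target)."""
    # Step 1: exact match at segment level
    exact = [(s, t) for s, t in tm if s == segment]
    if exact:
        return "exact", exact

    # Step 2: close (fuzzy) match, still at segment level, best score first
    scored = [(similarity(segment, s), s, t) for s, t in tm]
    fuzzy = sorted((x for x in scored if x[0] >= fuzzy_threshold), reverse=True)
    if fuzzy:
        return "fuzzy", fuzzy

    # Step 3: "leave" the segment and search for its longest phrases
    # inside every other TM source segment
    words = tokens(segment)
    for n in range(len(words), min_words - 1, -1):
        hits = []
        for gram in ngrams(words, n):
            for s, t in tm:
                if gram in ngrams(tokens(s), n):
                    hits.append((" ".join(gram), s, t))
        if hits:
            return "subsegment", hits
    return "none", []
```

Step 3 compares every phrase of the new segment with every segment in the TM, which shows in miniature why this kind of search gets slow with large or multiple TMs unless the tool builds an index first.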

Related is Auto-Suggest/Auto-Complete etc., which also looks for terms in other segments (and termbases), but only inserts the match when you start typing. It doesn't Auto-Assemble.

Hans



[Edited at 2015-08-21 14:55 GMT]



Wojciech_
Poland
Local time: 04:45
English to Polish
+ ...
TOPIC STARTER
How I understand this. Aug 21, 2015

I actually think that subsegment matching is something different from the Assemble function, but the first can be involved in the second.

What I understand by the Assemble function (in DVX and CafeTran) is the app trying to literally assemble the whole segment from various sources (MT, TM, glossaries etc.), while subsegment matching is trying to find smaller chunks, i.e. phrases, from the user's TMs.

In the past I remember Wordfast Classic had a function wherein if there were no 100% or fuzzy matches, WF looked for phrases and the user could adjust how long (in words) the phrases were to be. This, I believe, was the predecessor of today's subsegment matching.

As I understand it, Lift searches for phrases (that would otherwise produce no match, because the rest of the sentence is different) and ALSO highlights the appropriate translation of the phrase in your TM's target language. As I mentioned before, MemoQ has the "guess the translation" function in its Concordance, but from what I could see it's very unreliable.
Lift, however, is able to provide appropriate translations more frequently. Thanks to this, the target translation can be fed to you e.g. via Autosuggest.
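
Nothing in this thread explains how Lift actually picks the target phrase, so the following is NOT Lift's method. It is a naive Python sketch of the general problem of guessing a phrase's translation from a TM: collect the TM units whose source contains the phrase, then prefer the target word sequence that shows up with the phrase and rarely without it. The function name guess_translation, the max_len parameter and the scoring rule are all assumptions made for the example.

```python
import re
from collections import Counter


def tokens(text):
    """Lower-cased words with punctuation stripped."""
    return re.findall(r"\w+", text.lower())


def contains(words, phrase):
    """True if the word list `phrase` occurs as a consecutive run in `words`."""
    n = len(phrase)
    return any(words[i:i + n] == phrase for i in range(len(words) - n + 1))


def target_ngrams(words, max_len):
    """Set of all word runs of length 1..max_len, as tuples."""
    grams = set()
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            grams.add(tuple(words[i:i + n]))
    return grams


def guess_translation(phrase, tm, max_len=3):
    """Naive co-occurrence guess; tm is a list of (source, target) pairs."""
    phrase_words = tokens(phrase)
    inside, outside = Counter(), Counter()
    for source, target in tm:
        grams = target_ngrams(tokens(target), max_len)
        # Count each target n-gram once per unit, split by whether the
        # source of that unit contains the phrase
        if contains(tokens(source), phrase_words):
            inside.update(grams)
        else:
            outside.update(grams)
    if not inside:
        return None  # the phrase does not occur in the TM at all
    # Assumed scoring: frequent with the phrase, rare without it;
    # ties go to the shorter candidate, then to alphabetical order
    best = max(inside, key=lambda g: (inside[g] - outside[g], -len(g), g))
    return " ".join(best)
```

With a TM containing "Open the control panel." / "Open het configuratiescherm.", "Close the control panel." / "Sluit het configuratiescherm.", "Open the door." / "Open de deur." and "Save the file." / "Sla het bestand op.", looking up "control panel" returns "configuratiescherm". With fewer or less varied examples, a rule this simple easily picks up articles or neighbouring words instead, which gives an idea of why such guesses can be unreliable.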

Anyway, look at this video:
http://www.kftrans.co.uk/lift/



Samuel Murray  Identity Verified
Netherlands
Local time: 04:45
Member (2006)
English to Afrikaans
+ ...
WFC Aug 21, 2015

pro-lingua wrote:
In the past I remember Wordfast Classic had a function wherein if there were no 100% or fuzzy matches, WF looked for phrases and the user could adjust how long (in words) the phrases were to be. This, I believe, was the predecessor of today's subsegment matching.


I'm a long-time WFC user and I don't recall such a feature. I do recall, however, an old feature called "subfuzzy matching", wherein if the segment itself was very short (2-4 words), WFC would propose matches that it guessed contained useful words even though the match was below the fuzzy threshold. I also recall that it used to be possible to set WFC's fuzzy threshold really, really low, which would yield fuzzy matches that contained only phrase matches, but it wasn't an intelligent phrase matching service. The current version of WFC can't find matches below 50%, at all.


[Edited at 2015-08-21 19:31 GMT]



Meta Arkadia
Local time: 10:45
English to Indonesian
+ ...
Find & Replace Aug 22, 2015

Meta Arkadia wrote:
...and is usually called Auto-Assembly...


Please replace all instances of "Auto-Assemble" by "Fuzzy Matches," it may make slightly more sense.

Cheers,

Hans



Dominique Pivard  Identity Verified
Local time: 05:45
Finnish to French
Which site? Aug 22, 2015

pro-lingua wrote:
I've recently come across a website where a unique technology for finding subsegment matches is described. It's called Lift

Would you care to share this source?



Wojciech_
Poland
Local time: 04:45
English to Polish
+ ...
TOPIC STARTER
Link Aug 22, 2015

Dominique Pivard wrote:

pro-lingua wrote:
I've recently come across a website where a unique technology for finding subsegment matches is described. It's called Lift

Would you care to share this source?


It's one of the articles listed below the video that I gave the link to in my previous post. Sorry I'm writing from my mobile, so it's slightly difficult to provide the link directly.



Dominique Pivard  Identity Verified
Local time: 05:45
Finnish to French
Already in 2014? Aug 22, 2015

SDL Community wrote:
It has been ... in Studio 2015.

According to this paper, it was already implemented in Studio 2014. Did 2015 bring something new (in that respect) compared to 2014?



Michael Joseph Wdowiak Beijer  Identity Verified
United Kingdom
Local time: 03:45
Member (2009)
Dutch to English
+ ...
"… SDL has acquired the technology …" :-( Aug 22, 2015

pro-lingua wrote:

A question to SDL specialists here. I've recently come across a website where a unique technology for finding subsegment matches is described. It's called Lift and from what I understood with its help a translator will be able to retrieve numerous subsegment matches from their TM together with their translation (MemoQ offers something similar called "Guess translation" when using Concordance, but it's very inaccurate).

I have learnt that SDL has acquired the technology and my question is - will it be available soon in the next incarnation of Studio? I have seen a video where the technology is already implemented into one of the versions of Studio and the shown results were truly impressive.

Thank you.


A pity. Wonder when they will gobble up memoQ, Wordfast and Across too.

SDLX + Trados
->
SDL Trados Studio
->
SDL Trados memoQ Studio
->
SDL Trados memoQ Wordfast Studio
->
SDL Trados memoQ Wordfast Across Studio


Bad for the industry.

[Edited at 2015-08-22 07:59 GMT]


