Language Search Engines?
Thread poster: Alan Campbell

Alan Campbell  Identity Verified
Local time: 03:43
Russian to English
+ ...
Aug 15, 2006

I thought this would fit here rather than any of the tech forums, but feel free to move it if I judged wrongly.

I noticed a new sponsored link at the bottom of my KudoZ e-mails pointing to Lingotek, a provider of a Language Search Engine solution. The idea is really appealing and, since I'm presently researching translation technologies for a client of mine, it's rather timely as well.

A search on ProZ for Language Search Engine, LSE or Lingotek yields no hits, so I'm guessing that it's brand new technology (to consumers at any rate). Has anyone any experience with this kind of technology? Is it the future? Is TM software on the way out, or are we looking at a combination of TM with LSE? Has anyone tried the trial of Lingotek and, if so, how did you find it?

I'll give the trial a go myself when I get back from my hols towards the end of the month, but I thought it would be interesting to get a discussion going about this.

Here is a link to the provider:
http://www.lingotek.com/

There's a PDF document there on the topic of TM vs LSE, but it's somewhat biased in favour of LSE (as you would expect), so some more objective opinions would be welcome.

Alan



Ivaneide
Brazil
Local time: 23:43
English to Portuguese
+ ...
I liked how it was explained and tried it. Aug 16, 2006

Hello, Allen

I really cannot give you much info on this subject. Like I said, I read the ad and liked the idea. When I went to try it, it only gave two choices of export format, which I cannot recall at the moment: rtf, or something like that. Since most of the work I handle is in Word, I tried to export that and it just did nothing. I clicked on their help button and there was nothing to read there. I gave up and went on my merry way. So it looks like it is something new they came up with and are trying to get off the ground. Anyway, good luck. I hope to see more replies here because it did sound promising.

Ivan



Ivaneide
Brazil
Local time: 23:43
English to Portuguese
+ ...
Sorry, I misspelled your name...Alan Aug 16, 2006

Good luck


Alan Campbell  Identity Verified
Local time: 03:43
Russian to English
+ ...
TOPIC STARTER
Title Aug 16, 2006

Thanks for checking it out Ivan.

Ivaneide wrote:
When I went to try it, it only gave two choices of format to export, which I can not recall at the moment, rtf....something like that. Since most of the work I handle is in Word, I tried to export that and it just did nothing.


I read on the Lingotek site that one must install OpenOffice before attempting to align Word docs. Hang on and I'll see if I can find it...

Ah yes, here is the info (from the alignment page: http://www.lingotek.com/technology_alignment.html):

To import Microsoft Office proprietary formats into Lingotek users must go to http://www.openoffice.org, download OpenOffice (a free open source product sponsored by SUN Microsystems) and have it running when launching LingoAlign. This enables filters in OpenOffice to convert the Microsoft Office formats so they can be used by LingoAlign and then Lingotek. Always remember to have OpenOffice running when importing Microsoft Office documents into LingoAlign.

So, there is a bit of a barrier to entry IMO, since most translators I presume use MS Word. I think what's important now is their support of XLIFF (http://www.opentag.com/xliff.htm). I've read up a bit on that, and, from what I can gather, it's an open standard for extracting and exchanging content between applications.
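To make the XLIFF idea a little more concrete, here is a minimal sketch of reading translation units out of an XLIFF 1.2 file with Python's standard library. The sample document, the file name `manual.doc` and the helper `read_units` are made up for illustration; this is not Lingotek's code, and how Lingotek itself handles XLIFF is not something I know.

```python
import xml.etree.ElementTree as ET

# Namespace used by XLIFF 1.2 documents.
NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Hypothetical sample file: two segments, the second not yet translated.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="manual.doc" source-language="ru" target-language="en" datatype="plaintext">
    <body>
      <trans-unit id="1">
        <source>Шаг 1</source>
        <target>Step 1</target>
      </trans-unit>
      <trans-unit id="2">
        <source>Шаг 2</source>
      </trans-unit>
    </body>
  </file>
</xliff>
"""


def read_units(xliff_text):
    """Return a list of (id, source, target) tuples; target is None if missing."""
    root = ET.fromstring(xliff_text)
    units = []
    for unit in root.iterfind(".//x:trans-unit", NS):
        source = unit.findtext("x:source", default="", namespaces=NS)
        target = unit.findtext("x:target", default=None, namespaces=NS)
        units.append((unit.get("id"), source, target))
    return units


units = read_units(SAMPLE)
for unit_id, source, target in units:
    print(unit_id, source, target)
```

The point is only that source and target text sit in plain, tool-neutral XML, so any application that understands the format can pick up where another left off.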

As I said, I'll give LSE a whirl when I get back from my hols (off tomorrow - yay!)

Alan


Barnaby Capel-Dunn  Identity Verified
Local time: 04:43
French to English
A bad start Aug 16, 2006

Hi Alan
Very interesting indeed inasmuch as my untutored brain can understand it.
BUT BUT BUT they really shouldn't make a product available before it's ready to go. In particular, it's quite absurd to launch a product of this kind without a Help file.
For what it's worth, the man behind LSE would appear to be Timothy R. Hunt, formerly of Translators Intuition (which appears to have sunk without trace).


xxxmediamatrix
Local time: 22:43
Spanish to English
+ ...
Open source or closed shop? Aug 16, 2006

Alan Campbell wrote (quoting www.lingotek.com):

To import Microsoft Office proprietary formats into Lingotek users must go to http://www.openoffice.org, download OpenOffice (a free open source product sponsored by SUN Microsystems) and have it running when launching LingoAlign. This enables filters in OpenOffice to convert the Microsoft Office formats so they can be used by LingoAlign and then Lingotek. Always remember to have OpenOffice running when importing Microsoft Office documents into LingoAlign.


As is so often the case with so-called 'open source' software, they are open only w.r.t. other open software and are effectively a closed shop w.r.t. proprietary systems. That wouldn't be a problem if 95% or more of the world's PC community weren't using ... proprietary systems.

I also had a quick look at the site and rapidly decided I have other priorities.

MediaMatrix


xxxmediamatrix
Local time: 22:43
Spanish to English
+ ...
Will you ride the front of this new wave or follow in the steps of others as it rolls forward?* Aug 17, 2006

* Concluding sentence of the document discussed in this post.


Having a few minutes to spare - and being curious by nature - I've gone back to the lingotek site and read the whole of the document purporting to explain LSE: http://www.lingotek.com/lsereplacetm.pdf

Bearing in mind this is supposed to be the best thing since sliced bread in the translation world, and it was supposedly written by an authority on translation, I was interested to read the following statements (extracts from page 1):

"Traditional translation memory systems have been focused on the reuse of previous translations primarily at the sentence and heading level. It has provided a cost savings and an increase in quality through consistent use of these translations. They have been augmented with terminology management, Global Information Management (GIM) and translation project management systems to provide additional value to language professionals." (very first paragraph of the document)

Questions:

What does It refer to? - "Traditional translation memory systems" is plural.

They - Back to the plural - but what does 'They' actually refer to?

a cost savings - ????


"Still, less than 5% of the content in the world is exact match repeat sentences even if you include fuzzy matches."

Question: Did that make sense?

"A Language Search Engines (LSE) follows this pattern of indexing and accessing linguistic knowledge from a growing repository of multilingual content. What language professionals need most is a fast, easy and inexpensive way to access relevant bilingual knowledge Isn’t a translator’s skill a combination of their ability to write well combined with their cumulative bilingual knowledge?"


Questions:
A Language Search Engines (LSE) - 'A' is followed by EngineS, plural?

knowledge Isn’t - A full stop (period for US readers) would not go amiss here.

a combination of their ability to write well combined with - Hmmm. Idiomatic English at its best! And they're talking about 'writing well' ...

I'll not bore everyone with the other ten pages. Suffice it to say that there are literally dozens of similar errors in the English. OK (bowing to pressure from the populace) here's one more: "Fuzzy matches can be helpful, but have no basis in meaning." It's fuzzy? You don't understand it? You'd like more context? If you go to page 4 you'll find that the context doesn't help at all.

Several things strike me about this text.

A. Although there is no statement to this effect, it appears that this text may have been produced using LSE, possibly from a French-language source. (I suggest French as a possible source language on reading the English errors in example sentence 3, on page 4: "The wild child is destroying his new toy he got for Christmas at the dismay of his parents." Idiomatic English would use 'the' instead of 'his', and '... to the dismay ...', and these are typical errors 'twixt French and English.)

If that is the case, I don't understand why lingotek didn't either:

- State that it was an uncorrected LSE translation, in which case they might actually impress a lot of readers because, if the truth be told, it isn't that bad (compared to Babelfish, for example).

or

- Have it proof-read as befits any explanatory text on a public website selling language technology - in which case they might stand a better chance of persuading us they're able to recognise good English when they see it.

B. It is very dull reading - not because the subject-matter is unduly boring but because it uses a very restricted vocabulary.

C. A lot of it is unclear, ambiguous - or just plain nonsense. One example paragraph (page 1, para 2):

"Still translation memory systems are not used for all translation projects. Most organizations state that less than 20% of their projects are candidates for TM. Of course some translation providers who focus primarily on revisions of previous source documents may have 80% or higher usage. Still, less than 5% of the content in the world is exact match repeat sentences even if you include fuzzy matches."

Still - As in 'All the same...', or as in 'Even today ..' - or something else?

Most organizations state - Nonsense! Most organizations have never expressed a view on the subject.

How does 80% in the third sentence relate to 20% in the second?

Still - As in 'All the same...', or 'Nonetheless ...', or something else? (I've already raised doubts about the rest of this sentence - 'nuf zed!)

Conclusion: As I wrote in my first post in this thread: "I have other priorities". I would add, however, that the LSE concept looks interesting - it's a pity that lingotek seem to be shooting it in the head - and themselves in the foot!

MediaMatrix

Disclaimer: The extracts quoted here, without permission and with some words/phrases highlighted by me, are reproduced solely for the purpose of illustrating this contribution to the discussion in this thread, concerned with the possible role of LSE in professional translation.



Marinus Vesseur  Identity Verified
Canada
Local time: 19:43
English to Dutch
+ ...
Lingotek mumbo jumbo. Is this really LSE? Aug 25, 2006

mediamatrix wrote:
I've gone back to the lingotek site and read the whole of the document purporting to explain LSE: ...
Bearing in mind this is supposed to be the best thing since sliced bread in the translation world, and it was supposedly written by an authority on translation...

Thank you for your extensive report on Lingotek and LSE. I spent a few hours on it this morning, as curious as you were, and got as far as actually working on a doc file I had uploaded myself. What happened next is quite revealing about the true intentions of the Lingotek 'inventor': there were no matches. I tried two relatively common language pairs, but nothing was found, whether long or short segment.
So I wrote Lingotek, told them what had happened and asked them what I had done wrong.
This is an excerpt of their reply:
"We are definitely early in our progress at Lingotek and need to have more information in our databases. Our hope is that sharing of data will happen as users sign up for a free trail and start projects and upload their Translation Memory."
The true intention of Lingotek seems to be to gather (steal?) TM content and share it with the other subscribers. There are a thousand reasons why this is not a good idea besides probably being illegal. And it would have very little to do with the declared intention of using open source web content as a basis for translation.
The more I think about it, the more I'm convinced there's something fishy here.
If anyone has more information, please share it with us.



sonnyz
English to Macedonian
hope this clarifies things... Aug 25, 2006

Marinus Vesseur wrote:
The true intention of Lingotek seems to be to gather (steal?) TM content and share it with the other subscribers.


I think you are misunderstanding the whole concept of how the tool works. When you upload your document to translate it, you have the option to share your matches from that document with every other user, or not. If you choose not to share it, you will be the only one who will ever see that content again (unless you give someone permission to receive matches from that document). The more people share, the more content is accessible from the start. And since they don't use machine translation, you should probably not expect your document to retrieve matches just a couple of weeks after the tool is released. Remember, Rome wasn't built in a day...

-SonnyZ

"Be who you are and say what you feel, because those who mind don't matter and those who matter don't mind."


David Sirett
Local time: 04:43
French to English
+ ...
Money for nothing? Aug 27, 2006

sonnyz wrote:

I think you are misunderstanding the whole concept of how the tool works. When you upload your document to translate it, you have the option to share your matches from that document with every other user, or not. If you choose not to share it, you will be the only one who will ever see that content again (unless you give someone permission to receive matches from that document). The more people share, the more content is accessible from the start. And since they don't use machine translation, you should probably not expect your document to retrieve matches just a couple of weeks after the tool is released. Remember, Rome wasn't built in a day...

-SonnyZ

So they are basically asking $29.99 per month for a tool with zero content, i.e. of no use whatsoever at the moment?

Now that's what I call a business plan!

David



Luca Tutino  Identity Verified
Italy
Local time: 04:43
Member (2002)
English to Italian
+ ...
It looks interesting (and working) Aug 28, 2006

I decided to try and just started a project with lingotek - the basic user interface seems to work.

Right from the start I notice the following problems:

- apparently there are no shortcuts
- there is no button for copying source into target
- tmx upload lasted a long time
- after tmx upload I received an error message
- there is no way to see how much I have stored in my private index
- The equivalent of TM "concordance" is restricted to portions of the source sentence, i.e. searching the index for a word combination not included in the source sentence is not possible.
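To illustrate the last point, this is roughly what I mean by an unrestricted concordance search: look up any phrase in all stored source segments, whether or not it occurs in the sentence currently being translated. It is only a sketch under my own assumptions. The `concordance` function and the sample index are hypothetical, and real TM tools add tokenisation, stemming and ranking on top of this.

```python
def concordance(index, phrase):
    """Return every (source, target) pair whose source contains the phrase.

    `index` is a list of (source, target) tuples; matching is a plain
    case-insensitive substring test.
    """
    needle = phrase.casefold()
    return [(src, tgt) for src, tgt in index if needle in src.casefold()]


# Hypothetical private index with three English-Italian segments.
index = [
    ("Press the red button to stop the engine.",
     "Premere il pulsante rosso per arrestare il motore."),
    ("Press the green button to start the engine.",
     "Premere il pulsante verde per avviare il motore."),
    ("Close the cover.", "Chiudere il coperchio."),
]

hits = concordance(index, "the engine")
for src, tgt in hits:
    print(src, "->", tgt)
```

Here the phrase is typed in by the translator, so it can be any word combination at all.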

Quite a long list for a two-sentence trial! However, I still hope I can solve some of these problems.

[Edited at 2006-08-28 00:31]

[Edited at 2006-08-28 00:35]



Luca Tutino  Identity Verified
Italy
Local time: 04:43
Member (2002)
English to Italian
+ ...
I do not think they intend to steal Aug 28, 2006

Marinus Vesseur wrote:
The true intention of Lingotek seems to be to gather (steal?) TM content and share it


Lingotek give ample reassurances against this on their site. This part made perfect sense to me, although it would probably require a more careful perusal before uploading sensitive materials.


Barnaby Capel-Dunn  Identity Verified
Local time: 04:43
French to English
Indexes Aug 28, 2006

The product is quite clearly in alpha rather than beta at the moment, and I think they would have done better to hold back a little before allowing themselves to be shot down in flames!
For me the main question mark surrounds the "indexes" supposedly available (at a fee). I don't think many of us would be prepared to take the plunge without knowing a lot more about these.


Steve_W
German to English
What is the big idea anyway? Aug 30, 2006

From the way it is being touted, I was expecting a system that would automatically index and align existing multilingual web content. But, after a brief trial, I can't see any appreciable advantage over conventional TM systems. It can only offer segments manually entered into the system.

Also, many companies would be loath to place potentially sensitive documents on an external web server.



Rodolfo Raya  Identity Verified
Local time: 23:43
English to Spanish
Does it work? Aug 30, 2006

Hi,

I just tried translating a short RTF from Chinese to Spanish. I translated several sentences that were almost identical in meaning and got zero hits from the index.

For example, I translated:

"第一步" (Step 1) as "Paso 1"
"第二步" (Step 2) as "Paso 2"
"第三步" (Step 3) as "Paso 3"

and nothing was offered when I reached "第四步" (Step 4).

Any TM engine would be able to offer a fuzzy match from such simple sentences, but the LSE did not find anything in its index.
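To show what I mean by a fuzzy match, here is a minimal sketch. It is not how any particular TM engine (or Lingotek) works internally: it simply assumes a character-level similarity ratio from Python's `difflib`, and the `fuzzy_lookup` helper and its 0.6 threshold are my own arbitrary choices.

```python
from difflib import SequenceMatcher


def fuzzy_lookup(memory, segment, threshold=0.6):
    """Return (score, source, target) for the most similar stored source.

    `memory` maps source segments to translations. Returns None when no
    stored segment reaches the similarity threshold.
    """
    best = None
    for source, target in memory.items():
        score = SequenceMatcher(None, source, segment).ratio()
        if score >= threshold and (best is None or score > best[0]):
            best = (score, source, target)
    return best


# The three segments already translated in the trial.
memory = {"第一步": "Paso 1", "第二步": "Paso 2", "第三步": "Paso 3"}

match = fuzzy_lookup(memory, "第四步")
if match is not None:
    score, source, target = match
    print(f"{score:.0%} match: {source} -> {target}")
```

Two of the three characters are shared with each stored segment, so even this naive measure scores about 67% and would offer "Paso 1" as a suggestion for the translator to edit.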

Any ideas?

Rodolfo

