https://www.proz.com/forum/omegat_support/147806-omegat_20_released.html

OmegaT 2.0 released
Thread poster: Vito Smolej

Vito Smolej
Germany
Local time: 18:11
Member (2004)
English to Slovenian
+ ...
Oct 11, 2009

Dear All,

The 2.0 version of OmegaT is now released as a stable version, including a
revised manual.

Compared to the previous 1.8, 2.0 offers 39 functional enhancements.

The loading and indexing system has been completely rewritten, providing "on
demand" matching. As a result, load time should now be under a minute in
most cases. Memory consumption has also been reduced, making it possible to
load large projects (e.g., 300,000 words) together with large translation
memories (e.g., 63 MB, 20,000 entries). The on-demand computation is still
very fast, and the difference isn't usually noticeable.
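
As a rough illustration of what "on demand" matching can mean, here is a minimal Python sketch. It is not OmegaT's actual code: the `TM` data, the `matches_for` function, the 0.6 threshold and the use of `difflib` are all assumptions made for the example. The point is only that nothing is scored at load time; the TM is compared against a segment when that segment is opened, and the result is cached.

```python
from difflib import SequenceMatcher
from functools import lru_cache

# Hypothetical in-memory TM: (source, target) pairs, loaded without
# precomputing any matches. Targets are placeholders.
TM = (
    ("Open the file menu.", "(translation A)"),
    ("Close the file menu.", "(translation B)"),
    ("Save the project.", "(translation C)"),
)

@lru_cache(maxsize=1024)
def matches_for(segment, threshold=0.6):
    """Score the TM against one segment, only when that segment is opened."""
    scored = []
    for source, target in TM:
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score >= threshold:
            scored.append((round(score, 2), source, target))
    # Best match first; the tuple result is cached for repeat visits.
    return tuple(sorted(scored, reverse=True))

if __name__ == "__main__":
    for score, source, target in matches_for("Open the file menu."):
        print(score, source, target)
```

Loading stays cheap because it only reads the pairs; the cost of matching is paid per segment, and revisiting a segment hits the cache.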

The Editor has been rewritten, providing enhanced features for RTL
languages.

Using OmegaT-tokenizers (http://sourceforge.net/projects/omegat-plugins),
OmegaT 2.0 can compute fuzzy matches and glossary matches based on stemming,
which can greatly improve matching in most languages. "Stop words" are also
ignored in fuzzy matches for a number of languages, further improving the
matches.
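
To show why stemming and stop-word removal help, here is a small Python sketch. It is an assumption-laden toy, not the OmegaT-tokenizers implementation: `crude_stem` is a stand-in for a real stemmer, `STOP_WORDS` is a tiny illustrative list, and the Jaccard overlap is just one simple way to score similarity.

```python
import re

STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in"}  # tiny illustrative list

def crude_stem(word):
    """Strip a few English suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def tokens(text):
    """Lowercase, split into words, drop stop words, stem what is left."""
    words = re.findall(r"[a-z]+", text.lower())
    return {crude_stem(w) for w in words if w not in STOP_WORDS}

def similarity(a, b):
    """Jaccard overlap of stemmed, stop-word-free token sets (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)

if __name__ == "__main__":
    print(similarity("The translator opens the files",
                     "A translator opened a file"))
```

Compared word for word, the two example sentences share only "translator"; after stop-word removal and stemming they reduce to the same set of tokens, so a near-duplicate segment is no longer scored as a poor match.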

OmegaT supports dictionaries in StarDict (http://stardict.sourceforge.net/)
format.

OmegaT now allows getting a machine translation of the current segment with
Google Translate.

There are new filters for QuarkXPress Copy Flow Gold (making it possible to
use OmegaT for DTP projects), SubRip subtitles (SRT), LaTeX, Android
resources and ResX resources. The PO filter now loads existing translations.

OmegaT is available as a Java Web Start application
(http://omegat.sourceforge.net/webstart.html), so it can be used without
any installation.

Stability has also been improved, with several important bug fixes.

As part of these enhancements, OmegaT now requires Java 1.5.

Compared with the previous 2.0.4 update 1, the new stable 2.0.5 contains a
revised manual and a command line feature to generate pseudo-translated
TMXs.

OmegaT 2.0.5 can be downloaded from
https://sourceforge.net/projects/omegat/files/

... as per broadcast by Didier

[Edited at 2009-10-11 13:05 GMT]


 

Samuel Murray  Identity Verified
Netherlands
Local time: 18:11
Member (2006)
English to Afrikaans
+ ...
Some notes Oct 11, 2009

VitoSmolej wrote:
OmegaT supports dictionaries in StarDict (http://stardict.sourceforge.net/)
format.


Note that not all StarDict dictionaries work in OmegaT. Apparently there are several dialects (different subformats) of StarDict, and OmegaT works only with some of them. There is no way to tell in advance which dictionaries will work -- the only way to find out is to try them.

OmegaT is available as a Java Web Start application
(http://omegat.sourceforge.net/webstart.html), so it can be used without
any installation.


Just in case anyone isn't familiar with Java Web Start, well, it doesn't install OmegaT on your computer but it does download the entire program every time you want to use it. So this option wouldn't save you from having to download it -- it simply saves you from having to install it.

...the new stable 2.0.5 contains a ... command line feature to generate pseudo-translated TMXs.


Do you happen to know where in the user manual this procedure would be described?

Samuel


 

Didier Briel  Identity Verified
France
Local time: 18:11
English to French
+ ...
Java Web Start does download the program Oct 11, 2009

Samuel Murray wrote:

OmegaT is available as a Java Web Start application
(http://omegat.sourceforge.net/webstart.html), so it can be used without
any installation.


Just in case anyone isn't familiar with Java Web Start, well, it doesn't install OmegaT on your computer but it does download the entire program every time you want to use it. So this option wouldn't save you from having to download it -- it simply saves you from having to install it.

No, it downloads the program into a "Java cache". It only downloads it again if there are changes (thus providing automatic updates).


...the new stable 2.0.5 contains a ... command line feature to generate pseudo-translated TMXs.


Do you happen to know where in the user manual this procedure would be described?


As described in changes.txt:
- Generate pseudo-translated tmx
(see documentation->translation memories->pseudo-translated memory)

Didier


 

FarkasAndras  Identity Verified
Local time: 18:11
English to Hungarian
+ ...
Size matters Oct 11, 2009

VitoSmolej wrote:

The loading and indexing system has been completely rewritten, providing "on
demand" matching. As a result, load time should now be under a minute in
most cases. Memory consumption has also been reduced, making it possible to
load large projects (e.g., 300,000 words) together with large translation
memories (e.g., 63 MB, 20,000 entries).


Am I the only one who finds this woefully inadequate?
I mean, even leaving aside large projects that generate large TMs, just the Acquis TM in itself is about 1 million TUs, and if you add a TM created from the Europarl corpus plus a bit of this and that you can easily get to 10 times what OmegaT claims to be able to handle.
This is 2009, people are using large memories. If you overhaul your TM handling solutions, you should make sure they can handle a million or so TUs.

Anyway, it's good to see OmegaT development continue.


 

Samuel Murray  Identity Verified
Netherlands
Local time: 18:11
Member (2006)
English to Afrikaans
+ ...
Some ideas Oct 11, 2009

FarkasAndras wrote:
I mean, ... just the Acquis TM in itself is about 1 million TUs, and if you add a TM created from the Europarl corpus plus a bit of this and that you can easily get to 10 times what OmegaT claims to be able to handle. ... This is 2009, people are using large memories.


Personally I think there comes a point at which a standalone TM program is no longer sufficient, and it becomes necessary for a program to connect to the TM server, if the user wants to use large TMs.

I have no idea how large the Acquis TM is, but let's suppose it's 4 GB. How long do you think it would take for a CAT tool to index such a TM so that matches can be served from it? How long does it take your favourite CAT tool to do it?


 

Laurent KRAULAND (X)  Identity Verified
France
Local time: 18:11
French to German
+ ...
Indeed ;) Oct 11, 2009

FarkasAndras wrote:


Anyway, it's good to see OmegaT development continue.


And thanks for the information, Victor!

Samuel Murray wrote:
I have no idea how large the Acquis TM is, but let's suppose it's 4 GB. How long do you think it would take for a CAT tool to index such a TM so that matches can be served from it? How long does it take your favourite CAT tool to do it?


What would be the need to host such an oversized TM on a freelancer's computer/storage device anyway? Just wondering (not a question of capacity: I have 1 TB at my disposal)...

[Edited at 2009-10-11 21:03 GMT]


 

Vito Smolej
Germany
Local time: 18:11
Member (2004)
English to Slovenian
+ ...
TOPIC STARTER
Documentation as PDF Oct 12, 2009

see my profile here:

http://www.proz.com/profile/91005 *

or use the URL

http://www.textnart.de/OmegaT.pdf

I would appreciate hearing about omissions, inconsistencies etc.

Regards

Vito

* - I'm plugging it here just to improve my PageRank (g)


 

Vito Smolej
Germany
Local time: 18:11
Member (2004)
English to Slovenian
+ ...
TOPIC STARTER
re Acquis TM Oct 12, 2009

Samuel Murray wrote:
I have no idea how large the Acquis TM is, but let's suppose it's 4 GB. How long do you think it would take for a CAT tool to index such a TM so that matches can be served from it? How long does it take your favourite CAT tool to do it?


Here are some actual numbers:

size 270MB (just DE SL part)
loading time about 3 seconds on 2.0x OmegaT

Note that the Acquis material includes all the current languages (a nice XSLT script, anyone? ;), so, yes, it is huge. However, if you go for a single pair, it may still be huge ... but less huge (g).
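
As an alternative to an XSLT script, cutting a multilingual TMX down to a single pair could look something like the Python sketch below. It assumes the material is available as a TMX file with `xml:lang` (or the older `lang`) attributes on each `<tuv>`; the function name `keep_language_pair` and the sample data are made up for illustration, and the sample header is trimmed for brevity.

```python
import xml.etree.ElementTree as ET

# ElementTree's expanded name for the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def keep_language_pair(tmx_text, src, tgt):
    """Return TMX text keeping only units that contain both languages."""
    root = ET.fromstring(tmx_text)
    body = root.find("body")
    wanted = {src.lower(), tgt.lower()}
    for tu in list(body.findall("tu")):
        found = set()
        for tuv in list(tu.findall("tuv")):
            code = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            base = code.split("-")[0]  # treat "de-DE" as "de"
            if base in wanted:
                found.add(base)
            else:
                tu.remove(tuv)  # drop variants in other languages
        if found != wanted:
            body.remove(tu)  # unit lacks one side of the pair
    return ET.tostring(root, encoding="unicode")

SAMPLE = """<tmx version="1.4"><header/><body>
<tu><tuv xml:lang="de"><seg>Hallo</seg></tuv><tuv xml:lang="sl"><seg>Zdravo</seg></tuv><tuv xml:lang="fr"><seg>Bonjour</seg></tuv></tu>
<tu><tuv xml:lang="de"><seg>Danke</seg></tuv><tuv xml:lang="fr"><seg>Merci</seg></tuv></tu>
</body></tmx>"""

if __name__ == "__main__":
    print(keep_language_pair(SAMPLE, "de", "sl"))
```

This version parses the whole document in memory, which is fine for a sample but probably not for a file of several hundred megabytes; a streaming approach such as `xml.etree.ElementTree.iterparse` would be the more realistic choice there.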



[Edited at 2009-10-12 18:51 GMT]


 

Samuel Murray  Identity Verified
Netherlands
Local time: 18:11
Member (2006)
English to Afrikaans
+ ...
Just three seconds? Oct 12, 2009

VitoSmolej wrote:
Here are some actual numbers:
Size: 270MB (just DE SL part)
Loading time: about 3 seconds on 2.0x OmegaT


Just three seconds and it will give a fuzzy match from a segment anywhere in the TM (including, say, the rear)???


 

Vito Smolej
Germany
Local time: 18:11
Member (2004)
English to Slovenian
+ ...
TOPIC STARTER
Stand by for further news ... Oct 13, 2009

Samuel Murray wrote:
Just three seconds and it will give a fuzzy match from a segment anywhere in the TM (including, say, the rear)???

I'll do some more tests and report.

Of course, quoting how long it takes to load says nothing about how fast it matches. But I think it would be the laugh of the year if the access time scaled anything but logarithmically with TM size. Users would have noticed that some time ago.

Of course what I think may not match the reality. So let's do some tests.

Regards

Vito


 

Susan Welsh  Identity Verified
United States
Local time: 12:11
Member (2008)
Russian to English
+ ...
OmegaT-tokenizers 0.2-2.0 released Oct 14, 2009

OmegaT-tokenizers has been updated to include Lucene 2.9.0. This is the plugin that enables glossary "stemming" (to find inflections of words) and "stop-word" filtering, which eliminates little words like "and" and "the" from TM fuzzy matching.

The following new tokenizers are available:
Arabic, Persian, SmartChinese, Turkish, Hungarian and Romanian.

OmegaT-tokenizers is available from
https://sourceforge.net/projects/omegat-plugins/

(This just in from Didier.)


 

Hakan Kiyici  Identity Verified
Turkey
Local time: 19:11
Member (2009)
English to Turkish
+ ...
disappointed again Nov 24, 2010

I had installed OmegaT earlier. It did not work properly. I had given up.

Some articles I read about SubRip file types recommended OmegaT, so I installed the latest version. It runs at 50% CPU and gets stuck. Incredibly slow at times.


 

Didier Briel  Identity Verified
France
Local time: 18:11
English to French
+ ...
It is not a normal behaviour Nov 24, 2010

Hakan Kiyici wrote:

I had installed OmegaT earlier. It did not work properly. I had given up.

Some articles I read about SubRip file types recommended OmegaT, so I installed the latest version. It runs at 50% CPU and gets stuck. Incredibly slow at times.

This is not normal behaviour.

What is your operating system?

What version of OmegaT did you install?

Didier


 

