Giant(?) Multiterm Databases: Are they feasible?
Thread poster: Haluk Levent Aka
Haluk Levent Aka
Japanese to Turkish
Aug 8, 2004

I'm thinking about migrating most of my printed dictionaries/glossaries into MultiTerm format. When done, each termbase will contain between 5,000 and 15,000 terms.

Are such large termbases practical? I'm afraid that working with termbases of this size may slow down the translation process (a lag when closing/opening segments).

Does anyone have experience working with termbases of this size? Also, can anyone tell me what the actual size (in MB or GB) of a termbase with, say, 1,000 terms would be?

Thanks & Regards,
Hal



Harry Bornemann  Identity Verified
Mexico
English to German
They are feasible Aug 8, 2004

I converted all of the MS glossaries into one MultiTerm database, using a blindingly fast Perl program for the conversion and MS Access to clean up the result (the source contains a lot of garbage).

This MultiTerm database contains 217,058 entries and takes up 148 MB (zipped: 14 MB). It does not really slow down translation, because the TM search works independently of the MultiTerm search, which takes just one or two seconds longer than the TM search.

I hate to admit that this is much faster than the Déjà Vu search in the same converted MS glossary. That search takes too long for Déjà Vu to insert the terms into the translation automatically, but it is still fast enough if you disable automatic insertion and use it like MultiTerm. Used this way, it takes 3-6 seconds, which gives you a good chance of seeing the hits before you have finished the segment...

Maybe it would work faster if I converted it into a Déjà Vu Lexicon instead of a Terminology Database. I have not tested this yet, because I think I would have to clean up the database much more before it would be suitable for a Lexicon.

Good speed,
Harry

[Edited at 2004-08-08 23:51]



Victor Sidelnikov  Identity Verified
Russian Federation
Member (2004)
English to Russian
No problems Aug 9, 2004

I have some termbases with 130,000 terms. Size: 60-70 MB. No slowdown and no failures of any kind.
Broadly speaking, 5,000-15,000 terms is the usual size of a dictionary; you will not even notice the search in a termbase of that size.
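
For the question about the actual size of a 1,000-term termbase, the figures reported in this thread allow a rough back-of-the-envelope estimate. The Python sketch below is only that: it assumes file size grows linearly with the number of entries (real termbases have fixed overhead and indexes, and size depends on how many fields each entry has), and it treats the "130,000 terms" figure as 130,000 entries. The function and variable names are made up for illustration.

```python
# Rough termbase size estimate from the figures reported in this thread.
# ASSUMPTION: size scales linearly with the number of entries. Real files
# have fixed overhead and indexes, so treat the result as a ballpark only.

# (entries, size in MB) as reported by the posters
REPORTED = {
    "MS glossaries termbase": (217_058, 148.0),
    "130,000-term termbase (low)": (130_000, 60.0),
    "130,000-term termbase (high)": (130_000, 70.0),
}


def mb_per_entry(entries: int, size_mb: float) -> float:
    """Average size of one entry in MB."""
    return size_mb / entries


def estimate_range_mb(target_entries: int) -> tuple[float, float]:
    """Lowest and highest size estimate (MB) across the reported termbases."""
    estimates = [
        mb_per_entry(entries, size_mb) * target_entries
        for entries, size_mb in REPORTED.values()
    ]
    return min(estimates), max(estimates)


if __name__ == "__main__":
    for target in (1_000, 5_000, 15_000):
        low, high = estimate_range_mb(target)
        print(f"{target:>6} entries: roughly {low:.1f} to {high:.1f} MB")
```

On these assumptions, a 1,000-term termbase would come out at well under 1 MB, and a 15,000-term one at roughly 7 to 10 MB, far smaller than the termbases that are reported above to work without noticeable slowdown.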


Haluk Levent Aka
Japanese to Turkish
TOPIC STARTER
Thanks for all replies Aug 11, 2004

Thank you for your replies and comments. I'm much relieved that termbases with 5 to 15 thousand entries will not cause delays.

Regards,
Haluk

