Giant(?) Multiterm Databases: Are they feasible?
Thread poster: Haluk Levent Aka

Haluk Levent Aka
Local time: 11:15
Japanese to Turkish
+ ...
Aug 8, 2004

I'm thinking about migrating most of my printed dictionaries/glossaries into MultiTerm format. When done, each termbase will contain between 5,000 and 15,000 terms.

Are such large termbases practical? I'm afraid that working with termbases of this size may slow down the translation process (a lag between closing and opening segments).

Does anyone have any experience working with termbases of this size? Also, can anyone tell me what the actual size (in MB or GB) of a termbase with, say, 1,000 terms would be?

Thanks & Regards,
Hal



Harry Bornemann  Identity Verified
Mexico
English to German
+ ...
They are feasible Aug 8, 2004

I transformed all of the MS glossaries into one MultiTerm database, using a blindingly fast Perl program for the conversion and MS Access to refine it (the source contains a lot of garbage).

This MultiTerm database contains 217,058 entries and takes up 148 MB (zipped: 14 MB). It does not really slow down translation, because the TM search runs independently of the MultiTerm search, which takes just one or two seconds longer than the TM search.

I hate to admit that this is much faster than the Déjà Vu search in the same transformed MS glossary, which takes too long for Déjà Vu to insert the terms into the translation automatically. It is still fast enough if you disable automatic insertion and use it like MultiTerm. That way a search takes 3-6 seconds, which gives you a good chance of seeing the hits before you have finished the segment...

Maybe it would work faster if I transformed it into a Déjà Vu Lexicon instead of a Terminology Database. I have not tested this yet, because I think I would have to refine the database much more before it would be suitable for a Lexicon.

Good speed,
Harry

[Edited at 2004-08-08 23:51]



Victor Sidelnikov  Identity Verified
Russian Federation
Local time: 11:15
Member (2004)
English to Russian
+ ...
No problems Aug 9, 2004

I have some termbases with 130,000 terms. Size: 60-70 MB. No slowdown and no failures of any kind.
Broadly speaking, 5,000-15,000 terms is the usual size of a dictionary; you won't even notice the search in a termbase that small.
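
For the size question, the two figures quoted in this thread (217,058 entries in 148 MB, and 130,000 terms in 60-70 MB) allow a rough back-of-the-envelope estimate. The sketch below assumes that termbase size grows linearly with the number of entries, which ignores fixed overhead, indexes, and differences in how many languages and fields each entry holds. The function names are made up for illustration and are not part of MultiTerm.

```python
# Rough size estimate from the figures quoted in this thread.
# Assumption: size scales linearly with entry count (no fixed overhead).

BYTES_PER_MB = 1024 * 1024


def bytes_per_entry(total_mb: float, entries: int) -> float:
    """Average storage per entry for a termbase of known size."""
    return total_mb * BYTES_PER_MB / entries


def estimate_mb(entries: int, per_entry_bytes: float) -> float:
    """Projected termbase size in MB for a given number of entries."""
    return entries * per_entry_bytes / BYTES_PER_MB


# Figures reported in the thread.
samples = {
    "MS glossaries (217,058 entries, 148 MB)": bytes_per_entry(148, 217_058),
    "130,000 terms, 60 MB": bytes_per_entry(60, 130_000),
    "130,000 terms, 70 MB": bytes_per_entry(70, 130_000),
}

if __name__ == "__main__":
    for label, per_entry in samples.items():
        print(f"{label}: about {per_entry:.0f} bytes per entry")
        for count in (1_000, 5_000, 15_000):
            print(f"  {count:>6} entries: about {estimate_mb(count, per_entry):.1f} MB")
```

On these numbers an entry costs somewhere around 500 to 700 bytes, so 1,000 terms would come to well under 1 MB and 15,000 terms to roughly 7 to 10 MB. Treat this as an order-of-magnitude guide only; a real termbase with many descriptive fields per entry could be noticeably larger.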



Haluk Levent Aka
Local time: 11:15
Japanese to Turkish
+ ...
TOPIC STARTER
Thanks for all replies Aug 11, 2004

Thank you for your replies and comments. I'm much relieved that termbases with 5 to 15 thousand entries will not cause delays.

Regards,
Haluk

