How large is the biggest termbase Multiterm can handle?
Thread poster: FarkasAndras
FarkasAndras
Local time: 04:19
English to Hungarian
+ ...
Apr 28, 2008

Hi everyone!

I'm thinking about throwing large databases at MultiTerm 7. You ask why? Why not? This is just to see if it's even feasible before I get at it. Would it choke or would it gulp them down like candy? (The reason being that I'm too lazy to look up terms in different sources and might be able to get some good resources in the right format. I quickly get bored of ALT+TAB and copy/paste.)

For example, a dictionary with 250 000 word/expression pairs (I have it already and could easily convert it into an importable format).
Perhaps all the different glossaries I can get my hands on, lumped into one (say, 50 000 entries).
Or something even crazier, who knows?

Does anyone have experience with MT and large termbases? Does it refuse to work? Crash? Eat up RAM? Search slowly?

What's the biggest you've managed to make MT work with? BTW my computer has a quick processor and 2GB of RAM so system resources are reasonably solid.

Thanks for any help.



Vito Smolej
Germany
Local time: 04:19
Member (2004)
English to Slovenian
+ ...
One way to find out Apr 28, 2008

What's the biggest you've managed to make MT work with? BTW my computer has a quick processor and 2GB of RAM so system resources are reasonably solid.


1. Start with an empty Termbase
2. Import a file (*)
3. repeat 2
- until
i. you fall asleep
ii. the PC/Jet engine burps something about "no more space available for ..."
iii. the hell freezes over
iv. your wife files for a divorce
v. (add your own terminating condition)

I would assume the resulting (biggest possible) TermBase "will fill out the disk space allotted to it", which to the best of my knowledge would correspond to the free space on the C drive (nota bene: ... where the TermBase eventually resides will have no influence on it!..). Get yourself some warm underwear; it will be a long, long vigil (the PC will of course start caching some time down the road).


*: the file size selected depends on the accuracy you expect of the result and on your patience. A file of 1 record would be the most accurate but the slowest, and, let's say, 10 000 records would be fast enough, even if the result is then accurate to five, four or three digits only.
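
The import loop and the file-size trade-off can be sketched as a small simulation. This is only an illustration under stated assumptions: `FakeTermbase` and `find_capacity` are hypothetical names, not anything MultiTerm provides, the hidden limit is invented, and each file is assumed to be imported all-or-nothing.

```python
class FakeTermbase:
    """Hypothetical stand-in for a termbase with a hidden entry limit."""

    def __init__(self, limit):
        self.limit = limit
        self.count = 0

    def try_import(self, n):
        """Import n entries; return False if they would not fit."""
        if self.count + n > self.limit:
            return False  # simulates "no more space available for ..."
        self.count += n
        return True


def find_capacity(try_import, batch_size, max_batches=1_000_000):
    """Repeat step 2 with fixed-size files until an import fails.

    Returns the number of entries stored, which undershoots the true
    limit by less than one batch_size.
    """
    stored = 0
    for _ in range(max_batches):
        if not try_import(batch_size):
            break
        stored += batch_size
    return stored


if __name__ == "__main__":
    for batch in (10_000, 1):
        tb = FakeTermbase(limit=123_456)  # invented limit for the demo
        print(batch, find_capacity(tb.try_import, batch))
```

With the invented limit of 123 456 entries, files of 10 000 records stop at 120 000, while single-record files reach 123 456 but need over a hundred thousand imports to get there.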

[Edited at 2008-04-28 13:03]


FarkasAndras
Local time: 04:19
English to Hungarian
+ ...
TOPIC STARTER
not so sure Apr 28, 2008

Well, Workbench has serious trouble opening large translation memories (EU corpus) even when system resources are easily sufficient.
So there could be some similar limitation hardwired into MultiTerm.
We'll see if someone has experience with it.


Daniel García
English to Spanish
+ ...
How many TUs? Apr 28, 2008

FarkasAndras wrote:

Well, Workbench has serious trouble opening large translation memories (EU corpus) even when system resources are easily sufficient.


The EN-HU EU TM has about 1.5 million translation units, doesn't it?

It would be interesting to see how different CAT tools manage that size of TMs...

As for MultiTerm, I guess that the limit is set by the database back end that you are using (be it Jet engine, or SQL Server or Oracle).

Daniel



Grzegorz Gryc
Local time: 04:19
French to Polish
+ ...
MS Jet DB size is 2 GB... Apr 28, 2008

Vito Smolej wrote:

What's the biggest you've managed to make MT work with? BTW my computer has a quick processor and 2GB of RAM so system resources are reasonably solid.


1. Start with an empty Termbase
2. Import a file (*)

Don't use large MT5 files.
The hell may freeze.

3. repeat 2
- until
i. you fall asleep
ii. the PC/Jet engine burps something about "no more space available for ..."


[...]

I would assume the resulting (biggest possible) TermBase "will fill out the disk space allotted to it", which to the best of my knowledge would correspond to the free space on the C drive

Wrong.
A database in this version of MS Jet is limited to 2 GB.
IMHO that allows, in theory, some hundreds of thousands of text entries, but the performance may be... ehm... rather poor...
IMHO it may depend heavily on the entry structure.
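
For a rough feel of what a 2 GB cap means in entries, here is a back-of-the-envelope sketch. The per-entry sizes are invented for illustration (nobody in this thread measured them), `max_entries` is a hypothetical helper, and real Jet files carry index and page overhead that this ignores.

```python
JET_LIMIT_BYTES = 2 * 1024**3  # the 2 GB cap, taken as 2 GiB


def max_entries(bytes_per_entry):
    """Upper bound on entries if each one occupied bytes_per_entry on disk."""
    if bytes_per_entry <= 0:
        raise ValueError("bytes_per_entry must be positive")
    return JET_LIMIT_BYTES // bytes_per_entry


if __name__ == "__main__":
    # Hypothetical footprints, NOT measured MultiTerm figures.
    for size in (2_048, 4_096, 8_192):
        print(f"{size} bytes/entry -> at most {max_entries(size):,} entries")
```

If an entry took a few kilobytes, the ceiling would land in the hundreds of thousands, and a richer entry structure would pull it down further.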

Normally, I use MT termbases of 20-30 thousand words.

In DVX (also based on the MS Jet engine), I have used termbases of up to 1 000 000 entries, but that needed a special optimization that is impossible in MultiTerm (see below).

(nota bene: ... where the TermBase eventually resides, will have no influence on it!..).

No comments.

[...]

Cheers
GG


FarkasAndras
Local time: 04:19
English to Hungarian
+ ...
TOPIC STARTER
wandering on Apr 28, 2008

dgmaga wrote:

The EN-HU EU TM has about 1.5 million translation units, doesn't it?

It would be interesting to see how different CAT tools manage that size of TMs...

About 880 000, IIRC. Lots anyway:)


dgmaga wrote:

As for MultiTerm, I guess that the limit is set by the database back end that you are using (be it Jet engine, or SQL Server or Oracle).


I never knew there were different options. No idea what mine is using. Just out of curiosity, how would I find out and does it matter much?



Grzegorz Gryc
Local time: 04:19
French to Polish
+ ...
Other engines... Apr 28, 2008

FarkasAndras wrote:

dgmaga wrote:

The EN-HU EU TM has about 1.5 million translation units, doesn't it?

It would be interesting to see how different CAT tools manage that size of TMs...

About 880 000, IIRC. Lots anyway:)


I use some TMs like this for concordance in Trados and DVX.
No serious problems, although the TMs are 3-4 times bigger than the Workbench TMs.


dgmaga wrote:

As for MultiTerm, I guess that the limit is set by the database back end that you are using (be it Jet engine, or SQL Server or Oracle).


I never knew there were different options. No idea what mine is using. Just out of curiosity, how would I find out and does it matter much?


MS SQL and Oracle are available only for MultiTerm Server.
Since SDL considers that a standard translator doesn't need advanced, performant, speed-optimized technologies, MT Desktop comes only with the slowest engine, Jet.

BTW.
If you want to use a scalable, performant SQL engine, take a look at Swordfish.
It can use the free edition of Oracle (or MySQL).

Cheers
GG

[Edited at 2008-04-28 14:33]

