affordable TM server solution
Thread poster: Lutz Molderings

Lutz Molderings  Identity Verified
Germany
Local time: 08:54
Member (2007)
German to English
Oct 17, 2011

Over the years, I have acquired a rather extensive translation memory, currently containing over a million segments. Obviously, as a single TM this data is of little use to me. No CAT tool can handle this amount of data - the TM is over 17 GB in size, which is obviously far too large for file-based processing. As a result, I am limited to using customer or topic-specific TMs. Not ideal.

What I really need is a server solution, but all the products offered by the major players are for corporate users and not affordable for an individual. At least that's what I think.
Maybe somebody knows of an affordable solution I haven't heard of. That would be awesome.



Boris Matveev
Russian Federation
Local time: 10:54
English to Russian
to clean a bit? Oct 17, 2011

I clean my general TM from time to time: not all the pairs in the TM are useful for general use. Cleaning helps with the volume problem...


Samuel Murray  Identity Verified
Netherlands
Local time: 08:54
Member (2006)
English to Afrikaans
Wordfast Server Oct 17, 2011

Lutz Molderings wrote:
What I really need is a server solution, but all the products offered by the major players are for corporate users and not affordable for an individual.


I only know of Wordfast Server, but I have no idea what it costs (perhaps you do?). The downsides of WFS are that (a) you can only use it with WFC or WFP and (b) you need to run it on a separate computer on your LAN.

WFC used to work with a free server, installed on the same computer, that could handle 5 million TUs, but AFAIK that is no longer distributed officially (and you had to use WFC to use it).



Stanislaw Czech, MCIL  Identity Verified
United Kingdom
Local time: 07:54
Member (2006)
English to Polish
Maybe you could try to streamline this TM a bit Oct 17, 2011

Unfortunately, AFAIK, server solutions and affordability don't go hand in hand. Even if you found cheap software, you would still need a powerful server capable of processing a 17 GB database.

I am using DGT's TM in Studio, with some 796,438 units; with a fast HD and a good amount of RAM it is quite usable. However, it is a bit less than 1 GB in size. Maybe your TM contains some extra information that is not necessary?

Alternatively, you could think about upgrading your hardware instead of buying a server (a physical server to run the software). I guess that if you bought a state-of-the-art PC with 64-bit Windows (which allows you to use more than 3 GB of RAM), a dedicated fast SSD for the operating system and your TM, a fast processor and some 12-16 GB of RAM, it would be possible to use your 17 GB TM with your current CAT tool.

Cheers
S



Soonthon LUPKITARO(Ph.D.)  Identity Verified
Thailand
Local time: 14:54
Member (2004)
English to Thai
TM filtering? Oct 17, 2011

Just an idea.

My big TM is not useful for every job type, so I filter it and split it into many smaller TMs, e.g. for marketing, legal and technical jobs.
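
A rough sketch of what such a split could look like, assuming the big TM has been exported to TMX and each unit carries a subject label. The `x-subject` property name, the `split_by_subject` helper and the sample segments are made up for illustration; a real export may label subjects differently or not at all.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Trimmed TMX-like sample; the "x-subject" property is a made-up label.
TMX = """<tmx version="1.4"><body>
<tu><prop type="x-subject">legal</prop>
  <tuv xml:lang="en"><seg>All rights reserved.</seg></tuv>
  <tuv xml:lang="de"><seg>Alle Rechte vorbehalten.</seg></tuv></tu>
<tu><prop type="x-subject">marketing</prop>
  <tuv xml:lang="en"><seg>Try it today.</seg></tuv>
  <tuv xml:lang="de"><seg>Jetzt testen.</seg></tuv></tu>
<tu>
  <tuv xml:lang="en"><seg>Start the engine.</seg></tuv>
  <tuv xml:lang="de"><seg>Motor starten.</seg></tuv></tu>
</body></tmx>"""

def split_by_subject(tmx_text, default="general"):
    """Group <tu> elements by their x-subject property.

    Units without the property end up in the `default` group.
    """
    groups = defaultdict(list)
    body = ET.fromstring(tmx_text).find("body")
    for tu in body.findall("tu"):
        prop = tu.find("prop[@type='x-subject']")
        subject = prop.text if prop is not None and prop.text else default
        groups[subject].append(tu)
    return groups

groups = split_by_subject(TMX)
for subject, units in sorted(groups.items()):
    print(subject, len(units))
```

Each group could then be written to its own TMX file and imported as a separate, smaller TM.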

Soonthon Lupkitaro



Hermann Bruns  Identity Verified
Local time: 08:54
English to German
MetaTexis Server Oct 17, 2011

Hello Lutz,

You might want to try the MetaTexis Server. It can be rented or purchased, starting at a few hundred euros.

Just send a request for a trial license key and a concrete offer to support@metatexis.com.

Note that if you want to use the MetaTexis Server, you will also have to use MetaTexis for Word. A free trial version is available at www.metatexis.com.

However, if you are the only one who is going to use the server, there is actually no need to purchase one. In that case, upgrading your hardware and tweaking the search options are the only things that help (or reorganizing the TM into sub-TMs per subject, per customer, or per language, as appropriate). A server solution cannot work faster than a local solution if the server hardware is not more powerful than the client hardware.

Best regards
Hermann

[Edited at 2011-10-17 08:03 GMT]



Lutz Molderings  Identity Verified
Germany
Local time: 08:54
Member (2007)
German to English
TOPIC STARTER
why is my TM so large? Oct 17, 2011

Thanks both of you.

Stanislaw Czech, MCIL wrote:

I am using DGT's TM in Studio, with some 796,438 units; with a fast HD and a good amount of RAM it is quite usable. However, it is a bit less than 1 GB in size. Maybe your TM contains some extra information that is not necessary?


That's odd: my sdltm contains a bit over 1 million units and is over 17 GB in size, your sdltm contains almost 800,000 units but is just over 1 GB in size.

I have character-based concordance search enabled, but I wouldn't have thought that makes such a big difference in terms of size.

As for hardware, I don't think that can be the problem. I just bought a top-of-the-line ThinkStation S20 with 12 GB RAM.

[Edited at 2011-10-17 08:08 GMT]



Jabberwock  Identity Verified
Poland
Local time: 08:54
Member (2004)
English to Polish
Performance tricks Oct 17, 2011

If you want to improve performance, you could look into SSDs working in a RAID array. I am not sure, though, whether that would be cheaper than a server solution.

An example can be seen here:

http://www.youtube.com/watch?v=mKcSxd_ynsM

However, check with someone more knowledgeable whether it would provide a notable performance improvement in your case. As far as I know, mileage may vary even between different applications (it depends on how they use the disk).



Piotr Bienkowski  Identity Verified
Poland
Local time: 08:54
Member (2005)
English to Polish
One of the available options (actually two).... Oct 17, 2011

Lutz Molderings wrote:

Over the years, I have acquired a rather extensive translation memory, currently containing over a million segments. (...)

What I really need is a server solution, but all the products offered by the major players are for corporate users and not affordable for an individual. At least that's what I think.
Maybe somebody knows of an affordable solution I haven't heard of. That would be awesome.


are here: http://www.maxprograms.com/products/remotetm.html

Regards,

Piotr Bieńkowski

[Edited at 2011-10-17 10:40 GMT]


FarkasAndras
Local time: 08:54
English to Hungarian
17 GB? Oct 17, 2011

Lutz Molderings wrote:

Thanks both of you.

I am using DGT's TM in Studio, with some 796,438 units; with a fast HD and a good amount of RAM it is quite usable. However, it is a bit less than 1 GB in size. Maybe your TM contains some extra information that is not necessary?


That's odd: my sdltm contains a bit over 1 million units and is over 17 GB in size, your sdltm contains almost 800,000 units but is just over 1 GB in size.

I have character-based concordance search enabled, but I wouldn't have thought that makes such a big difference in terms of size.

As for hardware, I don't think that can be the problem. I just bought a top-of-the-line ThinkStation S20 with 12 GB RAM.

That is very odd indeed. Your 17 GB TM size is definitely anomalous.
I have various large TMs (DGT-TM and others), some stats:
400,000 TUs - 500 MB
860,000 TUs - 1.0 GB
440,000 TUs - 640 MB

Based on this, your TM should definitely be under 1.5 GB. I'd double check the size, and then if it's indeed 17GB, try to fix it. Export it, create a new TM and reimport. Perhaps Trados has a "reorganize" feature. You could try that - after backing up your ginormous original.
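
To make the arithmetic behind that estimate explicit, here is a small sketch. It treats the sizes as decimal MB and assumes TM size grows roughly in proportion to the number of TUs; the function names are just for illustration.

```python
# (translation units, size in decimal MB) for the three TMs listed above
samples = [(400_000, 500), (860_000, 1000), (440_000, 640)]

def bytes_per_tu(samples):
    """Average on-disk bytes per translation unit across all samples."""
    total_tus = sum(tus for tus, _ in samples)
    total_bytes = sum(mb * 1_000_000 for _, mb in samples)
    return total_bytes / total_tus

def estimated_size_gb(tu_count, samples):
    """Extrapolate linearly, assuming size is proportional to TU count."""
    return tu_count * bytes_per_tu(samples) / 1_000_000_000

avg = bytes_per_tu(samples)
print(f"average: {avg:.0f} bytes per TU")
print(f"1,000,000 TUs: about {estimated_size_gb(1_000_000, samples):.2f} GB")
```

With these three samples the average comes out at roughly 1,260 bytes per TU, i.e. about 1.26 GB for a million TUs.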

[Edited at 2011-10-17 11:21 GMT]



Lutz Molderings  Identity Verified
Germany
Local time: 08:54
Member (2007)
German to English
TOPIC STARTER
10 million, not 1 million Oct 17, 2011

Ok, just realised it's 10 million units, not 1 million.
I guess that really is too much.



[Edited at 2011-10-17 12:00 GMT]


FarkasAndras
Local time: 08:54
English to Hungarian
hardware Oct 17, 2011

Lutz Molderings wrote:

Ok, just realised it's 10 million units, not 1 million.
I guess that really is too much.


Probably. Also, I don't think a server solution would help at all. That would just mean that the TM is stored on a different machine and hits are served up via IP. Lookups will probably use the same technology. If the server hardware is not drastically more powerful than your computer, I don't see any reason for improvement. Perhaps a CAT tool with more robust lookup tech would help (relational DB? Across?). But I wouldn't bother with that. Get a last-gen SSD if you haven't yet. A 256 GB drive with a SandForce 15xx or newer controller will deliver 300+ MB/s sequential reads and crazy random reads. If that doesn't do the trick with the obscene amount of RAM you already have, it's time to give up.
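
For a rough feel for what those read speeds mean, the sketch below computes how long one full sequential pass over a 17 GB file would take. The 100 MB/s hard-disk figure is an assumption added for comparison, and real lookups go through indexes rather than scanning the whole file, so treat this as a back-of-the-envelope figure, not a prediction of lookup times.

```python
def full_scan_seconds(tm_size_gb, read_mb_per_s):
    """Seconds to read the whole file once at a sustained sequential rate.

    Uses decimal units (1 GB = 1000 MB).
    """
    return tm_size_gb * 1000 / read_mb_per_s

ssd = full_scan_seconds(17, 300)  # 300 MB/s is the SSD figure quoted above
hdd = full_scan_seconds(17, 100)  # 100 MB/s is an assumed hard-disk figure

print(f"SSD: about {ssd:.0f} s, HDD: about {hdd:.0f} s")
```

Even on the SSD, a single full pass takes close to a minute, which is why the way the CAT tool indexes and accesses the file matters at least as much as raw disk speed.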



Lutz Molderings  Identity Verified
Germany
Local time: 08:54
Member (2007)
German to English
TOPIC STARTER
interesting Oct 17, 2011

Jabberwock wrote:

If you want to improve performance, you could look into SSDs working in a RAID array. I am not sure, though, whether that would be cheaper than a server solution.

An example can be seen here:

http://www.youtube.com/watch?v=mKcSxd_ynsM

However, check with someone more knowledgeable whether it would provide a notable performance improvement in your case. As far as I know, mileage may vary even between different applications (it depends on how they use the disk).


Thanks Jabberwock! This is something I am going to take a closer look at.



Sergei Leshchinsky  Identity Verified
Ukraine
Local time: 09:54
Member (2008)
English to Russian
Size... Oct 17, 2011

The average size of a segment is a little over 1 kB, mainly due to the many XML tags (most of them empty).

Lutz Molderings wrote: 10 million, not 1 million

I never use ONE BIG TM FOR ALL. I have many specialized TMs and combine them to suit the job requirements. While translating, they are all read-only. The current segments are saved in a temporary (project) TM. After the job is done, the temporary TM is used to update one of the archive TMs: the one that is topically closest.

So, my TMs never grow beyond 1GB each. And one does not really need all the TMs at once.
(Why do I need "legal" or "agricultural" TMs when I translate something automotive?)
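
In code terms, the workflow might look something like the sketch below. Plain dictionaries stand in for TMs, `difflib.SequenceMatcher` stands in for the CAT tool's fuzzy matching (which works differently), and the subjects, segments and the `lookup` helper are all invented for illustration.

```python
from difflib import SequenceMatcher

# Archive TMs keyed by subject: source segment -> target segment.
# Subjects and segments are invented sample data.
archives = {
    "automotive": {"Motor abstellen.": "Switch off the engine."},
    "legal": {"Alle Rechte vorbehalten.": "All rights reserved."},
}

def lookup(segment, tms, threshold=0.75):
    """Return (score, source, target) for the best fuzzy match, or None."""
    best = None
    for tm in tms:
        for source, target in tm.items():
            score = SequenceMatcher(None, segment, source).ratio()
            if score >= threshold and (best is None or score > best[0]):
                best = (score, source, target)
    return best

# During the job: archives are only read, new pairs go into the project TM.
project_tm = {}
selected = [archives["automotive"]]  # only the TMs relevant to this job
hit = lookup("Motor abstellen", selected + [project_tm])
project_tm["Motor starten."] = "Start the engine."

# After the job: fold the project TM into the topically closest archive.
archives["automotive"].update(project_tm)
```

Because only the selected archives are searched, the amount of data scanned per lookup stays small no matter how many archive TMs exist in total.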

[Edited at 2011-10-17 13:28 GMT]



Lutz Molderings  Identity Verified
Germany
Local time: 08:54
Member (2007)
German to English
TOPIC STARTER
I do the same Oct 17, 2011

Sergei Leshchinsky wrote:

I never use ONE BIG TM FOR ALL. I have many specialized TMs and combine them to suit the job requirements. While translating, they are all read-only. The current segments are saved in a temporary (project) TM. After the job is done, the temporary TM is used to update one of the archive TMs: the one that is topically closest.

So, my TMs never grow beyond 1 GB each. And one does not really need all the TMs at once.
(Why do I need "legal" or "agricultural" TMs when I translate something automotive?)


[Edited at 2011-10-17 13:18 GMT]


I basically do the same, because I have no other option.
On any translation I work on, I have a project TM, a customer TM and possibly a subject-specific TM.

The problem is that you will rarely work on a translation that deals with only one specific subject matter. If you're working on a manual, there might be some legal content at the end; if you're working on marketing text, it might include medical content.

This is why I also have one "global" TM that contains all my bilingual data. Ideally, I'd like to have it running parallel to my other TMs, but it's simply too large.

