TM size and structure in Trados & memoQ
Thread poster: Dominique Pivard

Dominique Pivard  Identity Verified
Local time: 14:55
Finnish to French
Mar 1, 2013

I plan to make a video about concordance search in Trados (Workbench + Studio), which is why I asked how people use the feature in a recent post to the Trados forum. In relation to that forthcoming video, I just published an introductory video that shows how TMs are structured in Workbench, Studio and memoQ, and how much they "weigh" in each tool:

http://wordfast.fi/blog/cat-tools/2013/02/28/tms-under-the-hood-trados-vs-memoq/
or
http://youtu.be/m48WwhtF2F0?hd=1

The same 48 MB TMX (about 70,000 units) results in a 31 MB TM (5 files) in Workbench 8.3, an 83-139 MB TM (1 file) in Studio 2011 and a 303 MB TM (20 files) in memoQ. Quite a difference! Importing was also significantly slower in memoQ, though I didn't measure the exact time it took.
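
For anyone who wants to put those figures side by side, here is a minimal Python sketch (not taken from the video) that expresses each TM size as a multiple of the 48 MB TMX. The sizes are the ones quoted above; the function and variable names are made up for illustration.

```python
# Sizes in MB as reported in this post; Studio 2011 was given as a range.
TMX_MB = 48
tm_sizes_mb = {
    "Workbench 8.3": (31, 31),
    "Studio 2011": (83, 139),
    "memoQ": (303, 303),
}

def size_ratios(sizes, baseline):
    """Return {tool: (low_ratio, high_ratio)} relative to the baseline size."""
    return {
        tool: (round(low / baseline, 2), round(high / baseline, 2))
        for tool, (low, high) in sizes.items()
    }

ratios = size_ratios(tm_sizes_mb, TMX_MB)
for tool, (low, high) in ratios.items():
    label = f"{low}x" if low == high else f"{low}x to {high}x"
    print(f"{tool}: {label} the size of the TMX")
```

By this measure only the Workbench TM is smaller than the TMX it was built from, and the memoQ TM is a little over six times larger.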


 

Samuel Murray  Identity Verified
Netherlands
Local time: 13:55
Member (2006)
English to Afrikaans
+ ...
And in WFP? Mar 1, 2013

Dominique Pivard wrote:
In relation to that forthcoming video, I just published an introductory video that shows how TMs are structured in Workbench, Studio and memoQ, and how much they "weigh" in each tool...


I tried importing that TM into WFP and although it recognised about 68,000 TUs, it imported nothing (it just created an empty TM with a header and an associated JTX and lockfile). Does anyone know what the hidden setting in WFP is to let it know that, although I appreciate all the blinkenlichten, I actually want the TUs?

By the way, WFP complained during the import that some TUs had line breaks and tabs in them -- is this not allowed in TMX, or is the warning more for WFP users' benefit?
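
To see how many TUs a TMX really contains, and how many of them carry tabs or line breaks, a small script can help. This is only a sketch using Python's standard `xml.etree.ElementTree`; it assumes a TMX without XML namespaces on its elements, and the function name `scan_tmx` and the sample data are invented for illustration. It says nothing about what WFP itself checks.

```python
import io
import xml.etree.ElementTree as ET

def scan_tmx(source):
    """Count <tu> elements, and those with a tab or line break in any <seg>.

    `source` is a file path or a binary file object containing a TMX file.
    """
    total = flagged = 0
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "tu":
            continue
        total += 1
        for seg in elem.iter("seg"):
            # itertext() also picks up text inside inline tags such as <ph>.
            text = "".join(seg.itertext())
            if any(ch in text for ch in "\t\r\n"):
                flagged += 1
                break
        elem.clear()  # drop the unit's children to keep memory use low
    return total, flagged

# Tiny made-up sample; the first unit contains a tab character.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4">
  <header srclang="fi-FI" datatype="plaintext" segtype="sentence"
          adminlang="en-US" o-tmf="sample" creationtool="sample"
          creationtoolversion="1"/>
  <body>
    <tu>
      <tuv xml:lang="fi-FI"><seg>Nimi\tArvo</seg></tuv>
      <tuv xml:lang="fr-FR"><seg>Nom\tValeur</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="fi-FI"><seg>Hei</seg></tuv>
      <tuv xml:lang="fr-FR"><seg>Bonjour</seg></tuv>
    </tu>
  </body>
</tmx>
""".encode("utf-8")

total, flagged = scan_tmx(io.BytesIO(SAMPLE))
print(f"{total} TUs, {flagged} with tabs or line breaks")
```

For a real file, pass its path instead of the `io.BytesIO` object.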


[Edited at 2013-03-01 07:22 GMT]


 

Dominique Pivard  Identity Verified
Local time: 14:55
Finnish to French
TOPIC STARTER
Use WfConverter Mar 1, 2013

Samuel Murray wrote:
I tried importing that TM into WFP and although it recognised about 68,000 TUs, it imported nothing (it just created an empty TM with a header and an associated JTX and lockfile). Does anyone know what the hidden setting in WFP is to let it know that, although I appreciate all the blinkenlichten, I actually want the TUs?

By the way, WFP complained during the import that some TUs had line breaks and tabs in them -- is this not allowed in TMX, or is the warning more for WFP users' benefit?

Yes, I noticed WFP had problems with these particular TMX files. It's kind of worrying, because they come from Studio and are therefore a commonly encountered type.

My suggestion would be to use the standalone WfConverter mentioned in this blog post.

I was also able to import the TMX exported from memoQ. I don't know why WFP doesn't import the TMX that comes from Studio. Maybe Kristyna can investigate this with the WFP developers.


 

Samuel Murray  Identity Verified
Netherlands
Local time: 13:55
Member (2006)
English to Afrikaans
+ ...
Some more comments on WFP Mar 1, 2013

Dominique Pivard wrote:
Samuel Murray wrote:
I tried importing that TM into WFP and although it recognised about 68000 TUs, it imported nothing...

Yes, I noticed WFP had problems with these particular TMX files. It's kind of worrying, because they come from Studio and are therefore a commonly encountered type.


I managed to coax WFP into importing the TM. I created a new project with the same language codes as the TMX file, and then it worked. It did not occur to me that it would be a problem if there was any mismatch between the languages of the project and the languages of the TM that I was trying to add.
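
In case it helps someone else avoid the same trap, the language codes in a TMX can be listed before setting up the project. The sketch below is an assumption-laden illustration, not anything WFP provides: it reads the `srclang` attribute of the header and the `xml:lang` attribute of each `<tuv>` (falling back to a plain `lang` attribute), and the names `tmx_languages` and `project_langs` are hypothetical.

```python
import io
from collections import Counter
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def tmx_languages(source):
    """Return (header srclang, Counter of language codes found on <tuv>)."""
    srclang = None
    counts = Counter()
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "header":
            srclang = elem.get("srclang")
        elif elem.tag == "tuv":
            counts[elem.get(XML_LANG) or elem.get("lang")] += 1
        elif elem.tag == "tu":
            elem.clear()  # keep memory use low on large files
    return srclang, counts

# Tiny made-up sample standing in for a real TMX file.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4">
  <header srclang="fi-FI" datatype="plaintext" segtype="sentence"
          adminlang="en-US" o-tmf="sample" creationtool="sample"
          creationtoolversion="1"/>
  <body>
    <tu>
      <tuv xml:lang="fi-FI"><seg>Hei</seg></tuv>
      <tuv xml:lang="fr-FR"><seg>Bonjour</seg></tuv>
    </tu>
  </body>
</tmx>
""".encode("utf-8")

srclang, counts = tmx_languages(io.BytesIO(SAMPLE))
project_langs = {"fi-FI", "fr-CA"}  # hypothetical project setting
unexpected = sorted(set(counts) - project_langs)
print("srclang:", srclang)
print("codes in TMX:", dict(counts))
print("codes not in project:", unexpected)
```

Whether WFP compares codes exactly or more loosely is not something I know; the comparison here is a plain exact match.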

The file sizes were:

original tmx = 48 MB

JTX file = 3 MB
LOCK file = 0 MB
TXT file = 44 MB

I could not figure out how to do a concordance search in WFP without opening a file in it (I'm telling you, this is a weird program). So I created a fake Finnish file using some of the segments in the middle of the TM, and tested the concordance search (the thing called "TM Search", right?).

The results were:

"urvallisuuspoliittisii": no results (and I was told within 2 seconds).
"turvallisuuspoliittisii": 2 results (and there are only 2 of them in the TM) within about 1.5 seconds.
"asvihuonepäästöje": no results (and I was told within 2 seconds).
"kasvihuonepäästöje": 7 results, displayed all at once after about 3 seconds.
"tarjonnan+trendit": about 3 seconds, just as fast as "tarjonnan+trendi".
"tarjonna+trendi": slightly longer (about 4 seconds).
"energiajärjestö IEA:n mukaan tienhaarassa, kun kulutuksen ja tarjonnan trendit ovat kestämättömiä niin ekologisesti,": 8 seconds, and it found the 1 match.
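
The pattern in these results (hits when the search term starts a word, none when its first letter is dropped) looks like word-prefix matching. The Python sketch below is only a guess that reproduces that pattern; it is not how WFP actually works. It assumes that "+" and spaces both separate terms and that every term must begin some word in the segment. The sample segments are made up from words in the searches above.

```python
import time

def concordance(segments, query):
    """Return segments in which every query term is the start of some word.

    Terms are separated by '+' or whitespace; matching ignores case.
    """
    terms = query.replace("+", " ").casefold().split()
    hits = []
    for segment in segments:
        words = segment.casefold().split()
        if all(any(w.startswith(t) for w in words) for t in terms):
            hits.append(segment)
    return hits

def timed_search(segments, query):
    """Run a search and return (hits, elapsed seconds)."""
    start = time.perf_counter()
    hits = concordance(segments, query)
    return hits, time.perf_counter() - start

# Made-up stand-ins for TM segments.
SEGMENTS = [
    "Kulutuksen ja tarjonnan trendit ovat kestämättömiä.",
    "Kasvihuonepäästöjen vähentäminen on tärkeää.",
    "Turvallisuuspoliittisiin kysymyksiin palataan myöhemmin.",
]

for query in ("asvihuonepäästöje", "kasvihuonepäästöje", "tarjonna+trendi"):
    hits, elapsed = timed_search(SEGMENTS, query)
    print(f"{query!r}: {len(hits)} hit(s) in {elapsed:.6f} s")
```

A linear scan like this gets slower as the TM grows, so the timings it prints on three segments say nothing about the timings reported above.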


 

