TM size and structure in Trados & memoQ
Thread poster: Dominique Pivard

Dominique Pivard  Identity Verified
Local time: 16:53
Finnish to French
Mar 1, 2013

I plan to make a video about concordance search in Trados (Workbench + Studio), which is why I asked how people use the feature in a recent post to the Trados forum. In relation to that forthcoming video, I just published an introductory video that shows how TMs are structured in Workbench, Studio and memoQ, and how much they "weigh" in each tool:

http://wordfast.fi/blog/cat-tools/2013/02/28/tms-under-the-hood-trados-vs-memoq/
or
http://youtu.be/m48WwhtF2F0?hd=1

The same TMX (about 70,000 units, 48 MB) results in a 31 MB TM (5 files) in Workbench 8.3, an 83-139 MB TM (1 file) in Studio 2011 and a 303 MB TM (20 files) in memoQ. Quite a difference! The import was also significantly slower in memoQ, though I didn't measure the exact time it took.
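
To put those numbers side by side, here is a minimal Python sketch that expresses each TM size as a multiple of the 48 MB TMX. The figures are the ones quoted above; the names `TMX_MB`, `tm_sizes_mb` and `ratio_to_tmx` are purely illustrative.

```python
# Sizes in MB as quoted in the post. Studio 2011 is given as a range,
# so every tool is stored as a (low, high) pair for uniformity.
TMX_MB = 48
tm_sizes_mb = {
    "Workbench 8.3": (31, 31),
    "Studio 2011": (83, 139),
    "memoQ": (303, 303),
}


def ratio_to_tmx(low, high, tmx_mb=TMX_MB):
    """Return the (low, high) size as multiples of the TMX size."""
    return round(low / tmx_mb, 2), round(high / tmx_mb, 2)


ratios = {tool: ratio_to_tmx(*size) for tool, size in tm_sizes_mb.items()}

for tool, (low, high) in ratios.items():
    if low == high:
        print(f"{tool}: {low}x the TMX")
    else:
        print(f"{tool}: {low}x to {high}x the TMX")
```

In other words, the Workbench TM ends up smaller than the TMX it came from, while the memoQ TM is a bit over six times larger.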


 

Samuel Murray  Identity Verified
Netherlands
Local time: 15:53
Member (2006)
English to Afrikaans
And in WFP? Mar 1, 2013

Dominique Pivard wrote:
In relation to that forthcoming video, I just published an introductory video that shows how TMs are structured in Workbench, Studio and memoQ, and how much they "weigh" in each tool...


I tried importing that TM into WFP and although it recognised about 68,000 TUs, it imported nothing (just created an empty TM with a header and an associated JTX and lockfile). Does anyone know what the hidden setting in WFP is to let it know that, although I appreciate all the blinkenlichten, I actually want the TUs?

By the way, WFP complained during the import that some TUs had line breaks and tabs in them -- is this not allowed in TMX, or is the warning more for WFP users' benefit?
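
This does not settle whether TMX allows them, but to see how many units are affected one could scan the file independently of any CAT tool. A minimal Python sketch, assuming the usual TMX layout of `<tu>` elements containing `<tuv>`/`<seg>` children; `scan_tmx` and the tiny `SAMPLE` document are made up for illustration and the sample header is simplified:

```python
import io
import xml.etree.ElementTree as ET

# Simplified two-unit sample; the second unit contains a tab in each segment.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4"><header/><body>
<tu><tuv xml:lang="fi-FI"><seg>Hei</seg></tuv><tuv xml:lang="fr-FR"><seg>Bonjour</seg></tuv></tu>
<tu><tuv xml:lang="fi-FI"><seg>Yksi\tkaksi</seg></tuv><tuv xml:lang="fr-FR"><seg>Un\tdeux</seg></tuv></tu>
</body></tmx>"""


def scan_tmx(source):
    """Count all <tu> elements and those with a tab or line break in any <seg>."""
    total = flagged = 0
    # iterparse streams the file, so a 48 MB TMX need not be loaded in one go.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "tu":
            total += 1
            texts = ("".join(seg.itertext()) for seg in elem.iter("seg"))
            if any(ch in text for text in texts for ch in "\t\n\r"):
                flagged += 1
            elem.clear()  # free the unit once it has been counted
    return total, flagged


total, flagged = scan_tmx(io.BytesIO(SAMPLE))
print(f"{total} units, {flagged} with tabs or line breaks")
```

For a real file, pass the path of the TMX instead of the `io.BytesIO` object.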


[Edited at 2013-03-01 07:22 GMT]


 

Dominique Pivard  Identity Verified
Local time: 16:53
Finnish to French
TOPIC STARTER
Use WfConverter Mar 1, 2013

Samuel Murray wrote:
I tried importing that TM into WFP and although it recognised about 68,000 TUs, it imported nothing (just created an empty TM with a header and an associated JTX and lockfile). Does anyone know what the hidden setting in WFP is to let it know that, although I appreciate all the blinkenlichten, I actually want the TUs?

By the way, WFP complained during the import that some TUs had line breaks and tabs in them -- is this not allowed in TMX, or is the warning more for WFP users' benefit?

Yes, I noticed WFP had problems with these particular TMX files. It's kind of worrying, because they come from Studio and are therefore a commonly encountered type.

My suggestion would be to use the standalone WfConverter mentioned in this blog post.

I was also able to import the TMX obtained by export from memoQ. I don't know why WFP doesn't import the TMX that comes from Studio. Maybe Kristyna can investigate this with the WFP developers.


 

Samuel Murray  Identity Verified
Netherlands
Local time: 15:53
Member (2006)
English to Afrikaans
Some more comments on WFP Mar 1, 2013

Dominique Pivard wrote:
Samuel Murray wrote:
I tried importing that TM into WFP and although it recognised about 68,000 TUs, it imported nothing...

Yes, I noticed WFP had problems with these particular TMX files. It's kind of worrying, because they come from Studio and are therefore a commonly encountered type.


I managed to coax WFP into importing the TM. I created a new project with the same language codes as the TMX file, and then it worked. It did not occur to me that it would be a problem if there was any mismatch between the languages of the project and the languages of the TM that I was trying to add.
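
A quick way to spot such a mismatch before importing is to list the language codes the TMX actually uses and compare them with the project's. A minimal Python sketch; the function names, the sample document and the strict (case-insensitive but otherwise exact) comparison are all assumptions of mine, not a description of how WFP matches codes:

```python
import io
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Simplified one-unit sample, made up for illustration.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4"><header/><body>
<tu><tuv xml:lang="fi-FI"><seg>Hei</seg></tuv><tuv xml:lang="fr-FR"><seg>Bonjour</seg></tuv></tu>
</body></tmx>"""


def tmx_language_codes(source):
    """Collect the distinct xml:lang values found on <tuv> elements."""
    codes = set()
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "tuv":
            code = elem.get(XML_LANG)
            if code:
                codes.add(code)
        elif elem.tag == "tu":
            elem.clear()  # keep memory use flat on large files
    return codes


def mismatched_codes(tmx_codes, project_codes):
    """Return TMX codes with no exact (case-insensitive) match in the project."""
    project = {code.lower() for code in project_codes}
    return sorted(code for code in tmx_codes if code.lower() not in project)


codes = tmx_language_codes(io.BytesIO(SAMPLE))
# Hypothetical project set up with bare "fi" and "fr" codes.
problems = mismatched_codes(codes, ["fi", "fr"])
print(problems)
```

With the hypothetical project above, both `fi-FI` and `fr-FR` would be reported, which is the kind of mismatch that is easy to overlook.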

The file sizes were:

original tmx = 48 MB

JTX file = 3 MB
LOCK file = 0 MB
TXT file = 44 MB
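
For anyone who wants to repeat the comparison with their own TMs, adding up the files can be scripted. A minimal Python sketch, assuming all files of a TM sit in one folder and taking 1 MB as 1024 x 1024 bytes; `tm_footprint` and the dummy file names are made up for illustration:

```python
import tempfile
from pathlib import Path


def tm_footprint(folder):
    """Return (number of files, total size in MB) for everything under `folder`."""
    files = [p for p in Path(folder).rglob("*") if p.is_file()]
    total_bytes = sum(p.stat().st_size for p in files)
    return len(files), total_bytes / (1024 * 1024)


# Demo on a throwaway folder with dummy files, not a real TM.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "demo.txt").write_bytes(b"x" * 2 * 1024 * 1024)
    (Path(tmp) / "demo.jtx").write_bytes(b"x" * 1024 * 1024)
    (Path(tmp) / "demo.lock").write_bytes(b"")
    demo_count, demo_mb = tm_footprint(tmp)

print(f"{demo_count} files, {demo_mb:.1f} MB")
```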

I could not figure out how to do a concordance search in WFP without opening a file in it (I'm telling you, this is a weird program). So I created a fake Finnish file using some of the segments in the middle of the TM, and tested the concordance search (the thing called "TM Search", right?).

A search for "urvallisuuspoliittisii" yielded no results (and I was told within 2 seconds). A search for "turvallisuuspoliittisii" got me 2 results (and there are only 2 of them in the TM) within about 1.5 seconds. A search for "asvihuonepäästöje" got me no results (and I was told within 2 seconds) but a search for "kasvihuonepäästöje" yielded 7 results, which were displayed all at once after about 3 seconds. A search for "tarjonnan+trendit" was just as fast as a search for "tarjonnan+trendi" (about 3 seconds). A search for "tarjonna+trendi" took slightly longer (about 4 seconds). A search for "energiajärjestö IEA:n mukaan tienhaarassa, kun kulutuksen ja tarjonnan trendit ovat kestämättömiä niin ekologisesti," took 8 seconds and found the 1 match.
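
The first four results would be consistent with a search that matches from the start of a word rather than anywhere inside it, though that is only my guess and not documented WFP behaviour. A minimal Python sketch of such a word-prefix search; `prefix_concordance` and the two sample segments are made up, and the "+" operator is not modelled at all:

```python
import re


def prefix_concordance(segments, query):
    """Return segments containing a word that starts with `query` (case-insensitive)."""
    # (?<!\w) requires that the match is not preceded by a word character,
    # so the query has to begin at the start of a word.
    pattern = re.compile(r"(?<!\w)" + re.escape(query), re.IGNORECASE)
    return [segment for segment in segments if pattern.search(segment)]


# Made-up sample segments, not taken from the actual TM.
segments = [
    "Hallituksen turvallisuuspoliittisiin linjauksiin",
    "Kasvihuonepäästöjen vähentäminen",
]

hits_full = prefix_concordance(segments, "turvallisuuspoliittisii")
hits_clipped = prefix_concordance(segments, "urvallisuuspoliittisii")
print(len(hits_full), len(hits_clipped))
```

Dropping the first letter of the query gives no hits here, just as in the test above, whereas a plain substring search would have found the segment either way.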


 

