Baffling difference between memoQ and SDL Trados Studio word counts
Thread poster: Huw Watkins

Huw Watkins  Identity Verified
United Kingdom
Local time: 22:45
Member (2005)
Italian to English
+ ...
Mar 20, 2015

Hi Guys,

I could really do with some help on this as I have run out of ideas.

I have been sent a project that was prepared in memoQ. The PM has also sent me exports of all the TMs used to prepare the project on their end along with the mqxliff files. In preparing these files the PM has locked a good number of segments which, I assume, correspond to 100% TM matches/Reps and/or numbers.

The problem I am having is that the word count analysis they have provided varies hugely from mine. This is their word count for the 5 files:

Capture_zpsvzqtcxy5.jpg

This is my analysis from SDL Trados Studio:

Capture1_zpsjgsrvdmx.jpg

As you can see, I have ticked the following options in Studio in an effort to match the memoQ analysis as closely as possible:

Capture2_zpso5ql1g7u.jpg

The main differences are in the total and no match word counts. Is it possible that despite reporting locked segments as a separate category in Studio, those words are still included in the no matches (and therefore total word count), whereas memoQ simply does not count the locked segments at all?

Can anyone shed some light on this? It would be greatly appreciated.

Huw


 

Laura Harrison  Identity Verified
United Kingdom
Local time: 22:45
French to English
+ ...
Memo Q has a separate tick box to include locked rows... Mar 20, 2015

so I think your assumption regarding the locked rows is highly likely to be right. All the more so because, if you subtract the locked words from the total, you arrive at a more plausible difference in word count between the two tools.

And as Studio gives you the option to report locked segments as a separate category, this would seem to imply that they are always counted in the total anyway.

Laura


 

Huw Watkins  Identity Verified
United Kingdom
Local time: 22:45
Member (2005)
Italian to English
+ ...
TOPIC STARTER
If this is the case... Mar 20, 2015

Laura Harrison wrote:

Memo Q has a separate tick box to include locked rows...




...then I suspect I have gotten to the bottom of the issue. I rarely use memoQ and really do not know the ins and outs of its word count analysis.

I just want to be sure I am not translating twice as many words as I am getting paid for!!!

That said, it seems very strange that Studio would count locked segments as no match words. Perhaps Paul from SDL could shed some light on this?

Thanks Laura.

[Edited at 2015-03-20 15:19 GMT]


 

Chunyi Chen
United States
Local time: 14:45
English to Chinese
I would stay away from this client Mar 20, 2015

Hi Huw,

If you look at the TM resources the memoQ project was based on, you will see they used "Every TM and corpus", which I believe is some kind of online TM with unvalidated translated segments (I stand to be corrected, since I have never really used it). They also activated "homogeneity" in their analysis, which means "internal fuzzy matches", yet another way to rip off translators. With all these tricks to save money on their end, this company is not doing business ethically or fairly. What's worse, I think the translation product is doomed to fail, with all these dubious 100% matches and locked segments. It's probably better to turn down this project and stay away from this client.


 

Huw Watkins  Identity Verified
United Kingdom
Local time: 22:45
Member (2005)
Italian to English
+ ...
TOPIC STARTER
Not sure that I agree Mar 20, 2015

Chunyi Chen wrote:

I would stay away from this client.


This is actually a trusted client whom I have worked for on several occasions. They use memoQ as their default tool and give translators access to it via a server connection. They accommodate my requests to use my preferred tool and I don't really feel that they are trying to short-change me. I am also quite sure they bill their clients using the very same memoQ analysis.

The problem is less to do with the client, in my opinion, and more to do with the inner workings of the CAT tools. I remain a little unclear on whether and why SDL is picking up locked segments as new (no-match) words, when the client has now confirmed that memoQ doesn't count them at all.

Huw


 

Chunyi Chen
United States
Local time: 14:45
English to Chinese
Homogeneity should not be activated when doing the analysis Mar 20, 2015

Hi Huw,

They are your client, so you know better :P
For me, "every TM and corpus" and "homogeneity" raised some warning flags. I may have been wrong about "every TM" by saying it's an online machine translation TM product, but I don't think they should use the "homogeneity" feature when running the analysis. It would boost the fuzzy match numbers and you would end up getting paid less, if you offer a fuzzy match discount to the client.

Chunyi

Huw Watkins wrote:

Chunyi Chen wrote:

I would stay away from this client.


This is actually a trusted client whom I have worked for on several occasions. They use memoQ as their default tool and give translators access to it via a server connection. They accommodate my requests to use my preferred tool and I don't really feel that they are trying to short-change me. I am also quite sure they bill their clients using the very same memoQ analysis.

The problem is less to do with the client, in my opinion, and more to do with the inner workings of the CAT tools. I remain a little unclear on whether and why SDL is picking up locked segments as new (no-match) words, when the client has now confirmed that memoQ doesn't count them at all.

Huw


 

Huw Watkins  Identity Verified
United Kingdom
Local time: 22:45
Member (2005)
Italian to English
+ ...
TOPIC STARTER
Don't think it's machine translation Mar 23, 2015

Chunyi Chen wrote:

Hi Huw,

They are your client, so you know better :P
For me, "every TM and corpus" and "homogeneity" raised some warning flags. I may have been wrong about "every TM" by saying it's an online machine translation TM product, but I don't think they should use the "homogeneity" feature when running the analysis. It would boost the fuzzy match numbers and you would end up getting paid less, if you offer a fuzzy match discount to the client.

Chunyi



No, in this case the client has sent me exports of all 3 TMs they used in the analysis. As for Corpus, I think this is memoQ speak; I have no idea what it is. I'm pretty sure it's not automated translation though, as their POs require translators not to paste entire sentences into any online machine translation system, for the sake of client and business confidentiality. I'm sure someone more memoQ-savvy would know what a corpus is.

The point is that my analysis more or less matches theirs if I exclude the locked sentences (which I won't be translating for obvious reasons). I am just curious as to why SDL counts locked segments as no matches, which I now strongly feel is what is happening here.
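
To put that arithmetic in concrete terms, here is a minimal sketch. The figures are made up for illustration (they are not taken from the screenshots in this thread), and it assumes the Studio report lists locked words as their own row while still folding them into the grand total. The category names are placeholders too, not the exact labels either tool uses.

```python
# Hypothetical word counts per analysis category; illustration only,
# NOT the real figures from the project discussed in this thread.
studio_analysis = {
    "locked": 4000,
    "repetitions": 300,
    "100%": 500,
    "fuzzy": 700,
    "new": 2500,
}


def total_excluding_locked(analysis, locked_key="locked"):
    """Sum every category except the locked one, so the result can be
    compared with a report that leaves locked segments out entirely."""
    return sum(
        words for category, words in analysis.items() if category != locked_key
    )


studio_total = sum(studio_analysis.values())
comparable_total = total_excluding_locked(studio_analysis)

print("Studio total including locked:", studio_total)
print("Total to compare with the other report:", comparable_total)
```

If the second number lands close to the total in the memoQ report, the gap is most likely down to locked segments rather than to anything more sinister.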



[Edited at 2015-03-23 11:13 GMT]


 

Dominique Pivard  Identity Verified
Local time: 00:45
Finnish to French
memoQ corpora (LiveDocs) Mar 23, 2015

Huw Watkins wrote:
As for Corpus, I think this is memoQ speak; I have no idea what it is. I'm pretty sure it's not automated translation though, as their POs require translators not to paste entire sentences into any online machine translation system, for the sake of client and business confidentiality. I'm sure someone more memoQ-savvy would know what a corpus is.

A memoQ corpus (or corpora, if there are several of them) is the LiveDocs feature:

http://kilgray.com/memoq/2014R2/help-en/index.html?livedocs.html

A corpus basically acts like a TM.


 

ulrika månsson
Sweden
English to Swedish
+ ...
Did you check the file in Studio? Mar 23, 2015

In my experience, Studio counts the locked segments in the analysis, but when you open the files, they actually are locked, and Studio skips them during translation.
Kind regards,
Ulrika


 

Huw Watkins  Identity Verified
United Kingdom
Local time: 22:45
Member (2005)
Italian to English
+ ...
TOPIC STARTER
Agreed Mar 23, 2015

ulrika_m wrote:

In my experience, Studio counts the locked segments in the analysis, but when you open the files, they actually are locked, and Studio skips them during translation.
Kind regards,
Ulrika


You are confirming my initial suspicion that Studio counts locked segments, which puts my mind at rest because, as you say, Studio does indeed jump over the locked segments. I just find it curious that Studio would count locked segments as no match words. Anyway, I think this issue is resolved now.


 

pkolar
Slovenia
English to Slovenian
+ ...
Memoq vs Trados Aug 4, 2015

The issue is actually quite far from resolved. I work with memoQ and Trados a lot and there is always a difference in the analysis (sometimes a big one). For example, a client sends a Trados package with their analysis, I import the project into memoQ, and I always get a very different analysis, which is really weird. Trados and memoQ calculate matches and count words very differently, and this holds whether or not the package has internal matches on either side. Since memoQ calculates the word count differently from Trados, the match figures in the analysis differ as well.

Did anyone else notice this?

Kind regards,
Safex


 

Saskia de Korte
Netherlands
Local time: 23:45
English to Dutch
memoQ vs Trados Mar 23, 2016

Came across this thread after googling because I experienced a similar problem. My client provided me with an analysis made in MemoQ that showed only around 1200 no matches. When I ran an analysis of the same file with the same TM in Trados Studio 2015, there were over 2500 no matches. There are no locked segments in the file. I have no clue what causes this. A no match is a no match, right?

 

Stepan Konev  Identity Verified
Russian Federation
Local time: 00:45
English to Russian
The last line counts from 50% in memoQ Mar 23, 2016

Saskia de Korte wrote:

...in MemoQ that showed only around 1200 no matches... in Trados Studio 2015, there were over 2500 no matches


In memoQ, it is fair enough to treat the last fuzzy match line (50-74%) as no match. If you add this number to the no match figure, you will get roughly the same word count as the Trados no match figure.
Trados uses 70% as the minimum match value by default.
Thus, all fuzzy matches between 50% and 69% go to No match in Trados, while in memoQ they go to 50-74%.
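
The regrouping described above can be sketched as follows. This is only a rough illustration: the band labels and word counts are hypothetical, not read from any real report, and the result is approximate because the 70-74% part of the memoQ band would still count as a fuzzy match in Trados at the default 70% threshold.

```python
# Hypothetical memoQ analysis rows (word counts); illustration only.
memoq_counts = {
    "95-99%": 150,
    "85-94%": 200,
    "75-84%": 250,
    "50-74%": 1300,
    "no_match": 1200,
}


def approx_trados_no_match(counts, low_band="50-74%", no_match_key="no_match"):
    """Fold memoQ's lowest fuzzy band into no match to approximate what
    Trados reports as no match with a 70% minimum match value.

    This slightly overestimates the Trados figure, because segments
    scoring 70-74% would remain fuzzy matches there.
    """
    return counts.get(no_match_key, 0) + counts.get(low_band, 0)


estimate = approx_trados_no_match(memoq_counts)
print("memoQ no match:", memoq_counts["no_match"])
print("Approximate Trados-style no match:", estimate)
```

With made-up numbers like these, a memoQ no match count of around 1200 turns into roughly 2500 once the lowest band is folded in, which is the kind of gap described earlier in the thread.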

[Edited at 2016-03-23 17:47 GMT]


 

Bernhard Sulzer  Identity Verified
United States
Local time: 17:45
English to German
+ ...
No fuzzy discounts Mar 23, 2016

Stepan Konev wrote:

Saskia de Korte wrote:

...in MemoQ that showed only around 1200 no matches... in Trados Studio 2015, there were over 2500 no matches


In memoQ, it is fair enough to treat the last fuzzy match line (50-74%) as no match. If you add this number to the no match figure, you will get roughly the same word count as the Trados no match figure.
Trados uses 70% as the minimum match value by default.
Thus, all fuzzy matches between 50% and 69% go to No match in Trados, while in memoQ they go to 50-74%.

[Edited at 2016-03-23 17:47 GMT]


This is one reason one shouldn't agree to any discounts for any kind of segment matches calculated by a machine.

If you work with a CAT tool, make sure you know the settings for fuzzy and any other word counts YOU want to check yourself. I wouldn't just accept anyone's count.
Remember that it makes a difference if you use/are asked to use no TM, a good TM, or a crappy MT-generated TM to determine these matches. Whatever the matches are, you need to examine the original text in its entirety and context like we did in the old days, and, yes, still do, and estimate the work that's required, including all important aspects (your subject knowledge, complexity, fairly repetitive or non-repetitive content, deadline etc.)
Remember that a match, even with regard to segments in a TM provided to you says nothing about the quality of the TM, the degree of the actual match for the target text or the quality of the previous target text. A TM that is provided to you would have to be examined by you for its good or poor quality, a big extra step. You might elect not to work at all with TMs that are/were not created by you.

Finally, you should never quote discounted rates for fuzzies etc. - period/full stop - since they are based on matches calculated by a machine (CAT tool). Even if you did the analysis with your own CAT tool, know exactly what the analysis means and what it doesn't mean before you examine the other factors and quote a price for the project.

My recommendation is never to quote based on machine-calculated schemes of perfect and imperfect matches. Chances are all you are doing is lining the pockets of an unscrupulous agency, making much more unpaid work for yourself and setting yourself up for deadlines you can't keep.

Quote an overall price or, if you have to, a price based on the total words, taking into consideration all pertinent aspects of the project. At least that's what I do and it has served me well.


[Edited at 2016-03-24 04:40 GMT]


 

