What does "Save up to 20% on homogeneity" mean?
Thread poster: Heinrich Pesch

Heinrich Pesch  Identity Verified
Finland
Local time: 00:16
Member (2003)
Finnish to German
+ ...
Jan 8, 2010

What does that mean? I found this sentence at the MemoQ homepage. Sounds to me like 20 % more curly hair on a bald head.
Regards
Heinrich

[Edited at 2010-01-08 11:09 GMT]



[Subject edited by staff or moderator 2010-01-10 01:09 GMT]


 

Epameinondas Soufleros  Identity Verified
Greece
Local time: 00:16
Member (2008)
English to Greek
+ ...
Here's what it is Jan 8, 2010

This is the relevant extract from MemoQ's help:

Homogeneity: Analysis against the segments within the selected scope is called homogeneity analysis. This is one of MemoQ's power features. Check this checkbox to emulate building a translation memory during translation, and see the savings that will result from the internal similarities within the project. Using homogeneity, you are able to see the benefits of your future contribution – i.e. the contribution while you will be translating – to the translation memory.

Note: Using homogeneity, you are able to give a much better estimation of your resources to be spent on translation than without homogeneity. If you use the analysis to give a quotation, always look for the aggregate results as they reflect the real productivity gain through using MemoQ.


As you can see, it's nothing like curly hair on a bald scalp. It's advanced computational linguistics, nothing Trados can even come close to, I think.


 

SDL Community  Identity Verified
United Kingdom
Local time: 23:16
English
Trados capability? Jan 9, 2010

Epameinondas Soufleros wrote:
It's advanced computational linguistics, nothing Trados can even come close to, I think.


Hi,

I am interested to understand what this is too. It looks like an ability to forecast the benefit you get from repetitions as they become part of your TM, which of course has been part of Trados for years, but I would welcome a more complete explanation if it is something more.

Thanks

Paul


 

KSL Berlin  Identity Verified
Portugal
Local time: 22:16
Member (2003)
German to English
+ ...
Not quite Jan 9, 2010

Paul Filkin wrote:
... an ability to forecast the benefit you get from repetitions as they become part of your TM which of course has been part of Trados for years


Not at all, Paul. Trados has nothing like this feature. We aren't talking about repetitions as you understand them and as they are measured by Trados, DV, Practicount, etc. etc.

The homogeneity analysis as I understand it shows the "fuzzy repetitions" (my own term); actual 100% repetitions of content are measured by almost every tool these days.

If a "no match" sentence is in fact similar to another "no match" sentence but neither has any match (fuzzy or otherwise) in the TM, the homogeneity analysis will indicate the degree of similarity of the unknown sentences in a way similar to how a fuzzy match is identified with respect to the TM. From a programming standpoint, it's not a particularly astounding feat, but as far as I know no company before Kilgray has implemented this. I find it very useful for making more realistic estimates of effort when working under a tight deadline.
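For readers who want to see the idea concretely, here is a minimal sketch of such an analysis. It assumes Python's standard `difflib.SequenceMatcher` as a stand-in for a CAT tool's fuzzy score; it is not how MemoQ actually computes its figures, and the names `similarity` and `homogeneity` are made up for this illustration. Each segment is compared only with the segments that precede it, which emulates a TM filling up as the translation proceeds.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Word-level similarity in [0, 1]; an assumed stand-in for a fuzzy score."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


def homogeneity(segments: list[str]) -> list[tuple[str, float]]:
    """Pair each segment with its best score against the segments before it."""
    seen: list[str] = []  # emulates a TM that starts empty and grows during translation
    results = []
    for seg in segments:
        best = max((similarity(seg, prev) for prev in seen), default=0.0)
        results.append((seg, best))
        seen.append(seg)
    return results


# Hypothetical sample document: no TM is involved at any point.
segments = [
    "Press the red button to start the pump.",
    "Press the green button to stop the pump.",
    "Store the device in a dry place.",
    "Press the red button to start the pump.",
]

for seg, score in homogeneity(segments):
    print(f"{score:4.0%}  {seg}")
```

With this particular measure the second sample sentence scores 75% against the first, even though an empty TM would report both as "no match". A real tool will tokenize and weight differently, so the percentages are only indicative.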

There are two potentially bad sides to this. Agencies using a homogeneity analysis might try to use that information to squeeze translators' rates even further. In fact, at the MemoQ Fest in 2009, one presenter from France admitted to doing so, and I called attention to this in clear language to be sure that no one had missed the point. If there had been a rope in the room and a convenient tree nearby, I think I know what activity would have happened on the break. The other "bad" side is that the pleasant surprise of finishing a translation much earlier than one estimated based on a quantitative understanding of one's working speed is eliminated. There have been days when I had "wind in my sails" and translated 1000 words per hour in files with almost no fuzzies or repetitions. Only with the homogeneity analysis do I understand that the high fuzzies were there all along, just not identifiable as such by other tools.

If this were an SDL feature, I'm sure it would be marketed by pointing out how much money customers can "save". Instead, Kilgray markets it more by pointing out how much better you can plan. A difference of philosophy that contributes to my positive attitude toward the team and their product.

[Edited at 2010-01-09 22:18 GMT]


 

SDL Community  Identity Verified
United Kingdom
Local time: 23:16
English
Thanks for the clarification Jan 9, 2010

If this were an SDL feature, I'm sure it would be marketed by pointing out how much money customers can "save". Instead, Kilgray markets it more by pointing out how much better you can plan. A difference of philosophy that contributes to my positive attitude toward the team and their product.


Hi Kevin,

Thank you for the clarification on the fuzzy repetitions; that certainly makes it a lot clearer for me. Actually SDLX has had this capability for years, and it has always been a point of contention for us with Studio whether we ported this ability across or not for all the things you mention.

However, watch this space, he said in a whisper ;)

Regards

Paul


 

KSL Berlin  Identity Verified
Portugal
Local time: 22:16
Member (2003)
German to English
+ ...
Just do it Jan 10, 2010

SDL Support wrote:
Actually SDLX has had this capability for years, and it has always been a point of contention for us with Studio whether we ported this ability across or not for all the things you mention.

However, watch this space, he said in a whisper ;)


Really? Good to know. My ex loved SDLX, but I hated the "format painting", so I soon ignored my license.

I would say please migrate this useful feature to Studio 2009 but exercise judgment in how it is marketed. Since your popular competition already has the feature, it's not like there can be all that much stone throwing. And while you're at it, throttle the people who promote fuzzy scales as a way to "save money". The picture as you know is much more complex than that. It often takes me longer to alter a 90% fuzzy than to write a new translation. My main emphasis when discussing such tools is QA. The "save money" line should really (and responsibly) be restricted to context matching and its application elsewhere should be highly qualified.


 

Stefan de Boeck (X)  Identity Verified
Belgium
Local time: 23:16
English to Dutch
+ ...
or don't Jan 10, 2010

Kevin Lossner wrote:
…, the homogeneity analysis will indicate the degree of similarity of the unknown sentences in a way similar to how a fuzzy match is identified with respect to the TM.


This has always been possible, to a certain extent, in Trados, by using the Use previous TM feature.
I am not going to detail how precisely this can be done; I rather like a bit of innocence or downright ignorance in any agency.
Still, through time I have often wondered How come they don't pick up on this; a bit like looking at the leaning tower of Pisa and wondering Why doesn't it just topple over then? Not that you'd want it to.
So when finally there was this agency (from France) that found out how to use this feature I thanked them by raising my rent.

Kevin Lossner wrote:
And while you're at it, throttle the people…


Given the idiomatic meaning of at full throttle, perhaps you should have written put a chokehold on.


 

KSL Berlin  Identity Verified
Portugal
Local time: 22:16
Member (2003)
German to English
+ ...
Throttling Jan 10, 2010

Stefan de Boeck wrote:
Given the idiomatic meaning of at full throttle, perhaps you should have written put a chokehold on.


That's a different idiomatic expression; no chance of confusion for anyone with a reasonable understanding of English. And while your suggestion can, if incorrectly applied, have the same consequences, I generally applied chokeholds when I wanted to gain control without injuring the other person.

The tools are out there and the method is accessible - even for free, if you use an unlicensed version of MemoQ (aka MemoQ4Free). What's important, really, is that the tool vendors finally grow up and start promoting these techniques in a responsible way.

Homogeneity really isn't any more controversial for me than ordinary fuzzy match issues. Fuzzy matches save time sometimes, often they do NOT. I look at them strictly as a QA tool, and when in my judgment they are no help (timewise) or a hindrance, I charge a premium (or give no discounts) accordingly. If all some agency or end customer cares about is a discount scale, they can go fish in another lake. These things are not outside the scope of discussion, but they must be considered in the context of a project and relationship.

In the meantime many of us have enough solid experience dealing with the reality of fuzzy matches in practice that a certain amount of push-back is now possible and appropriate. Not useless whining about "declining rates" (which in fact might not be when considered as hourly throughput/earnings), but a discussion based on real data and sober analysis.


 

Stefan de Boeck (X)  Identity Verified
Belgium
Local time: 23:16
English to Dutch
+ ...
like Homer does Bart Jan 10, 2010

Kevin Lossner wrote:
That's a different idiomatic expression; no chance of confusion for anyone with a reasonable understanding of English.

Quite. So what about your average marketing dingbat? They're easily confused and may even have their belly buttons pierced for no apparent reason. But, anyway
Kevin Lossner wrote:
What's important, really, is that the tool vendors finally grow up and start promoting these techniques in a responsible way.

isn't that a bit like having a Drink in Moderation sticker on a bottle of Tequila? It's not the tool developers that worry me.
Kevin Lossner wrote:
... a discussion based on real data and sober analysis.

Right. And while you're at it, Kevin, would you please tell that snow outside to go away? It's been lying around there now for long enough.


 

István Lengyel
Hungary
Local time: 23:16
English to Hungarian
+ ...
stop dreaming, Kevin - I will protect him! :) Jan 11, 2010

Kevin Lossner wrote:

There are two potentially bad sides to this. Agencies using a homogeneity analysis might try to use that information to squeeze translators' rates even further. In fact, at the MemoQ Fest in 2009, one presenter from France admitted to doing so, and I called attention to this in clear language to be sure that no one had missed the point. If there had been a rope in the room and a convenient tree nearby, I think I know what activity would have happened on the break.


Kevin, I like that guy a lot, so please keep your hands away from him, otherwise you won't be a welcome guest at the next conference :)

István


 

Vito Smolej
Germany
Local time: 23:16
Member (2004)
English to Slovenian
+ ...
;)) Jan 15, 2010

Stefan de Boeck wrote:
This has always been possible, to a certain extent, in Trados, by using the Use previous TM feature....
I rather like a bit of innocence or downright ignorance in any agency.


To be fair, my first reaction was a "huh!? ..." (I never had any use for that check box).

[Edited at 2010-01-15 15:01 GMT]


 

Hynek Palatin  Identity Verified
Czech Republic
Local time: 23:16
English to Czech
+ ...
Squeezing translator's rates Jan 15, 2010

Kevin Lossner wrote:
Agencies using a homogeneity analysis might try to use that information to squeeze translator's rates even further.


They already do it with other CAT tools. That's why I'm glad this feature is not included in Trados.


 

Elisabeth Bull  Identity Verified
Local time: 23:16
English to Norwegian
+ ...
Logoport / Translation Workbench has it as well Feb 8, 2012

When the Lionbridge tool Logoport was launched some years ago, they bragged about this new analysis feature that was used in Logoport that would supposedly save the project managers up to 20% on the projects. Or steal 20% from us translators as I would rather put it.

Elisabeth


 

David Turner  Identity Verified
Local time: 23:16
French to English
+ ...
Not too sure ... Feb 8, 2012

... how much of a "power feature" it is or whether it involves "advanced computational linguistics" but having calculated a homogeneity percentage, I would have thought it was relatively trivial to actually identify and extract the sentences concerned. Surprisingly enough, none of the main CAT tools seem to do it though. It's very nice to know that your document has 20% fuzzy matches without any reference to a TM, but you can't actually do much with this knowledge other than take an evening off and go to the cinema on the strength of the time you're hopefully going to save on the job.

I've tried to go a stage further with my humble little PhraseMiner tool. One of its modules does this:
"FuzzyMiner identifies and extracts "internal" or "intra-document" fuzzy matches, i.e. sentences in the current document that are similar to each other but which do not necessarily yet have a corresponding fuzzy match in a TM. While analysis against a TM often flags a disappointing number of fuzzy matches, there are often quite a few such "intra-document" matches in the average document. Some CAT tools give you a percentage analysis of such sentences or "homogeneity" analysis but do not identify or extract the actual sentences to let you work on them in one go. FuzzyMiner extracts and opens these sentences in a new document and displays the first fuzzy in each series in normal font and the other "fuzzy repetitions" in each series in italics underneath."
http://asap-traduction.com/PhraseMiner
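
To make the "series" idea above tangible, here is a rough sketch of how such an extraction could be done. It is not PhraseMiner's code: the function names, the 0.7 threshold and the use of Python's `difflib.SequenceMatcher` as the similarity measure are all assumptions chosen for illustration. Each sentence joins the first series whose leading sentence it resembles closely enough, and sentences with no partner are dropped.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Word-level similarity in [0, 1]; an assumed stand-in for a fuzzy score."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


def fuzzy_series(sentences: list[str], threshold: float = 0.7) -> list[list[str]]:
    """Group sentences into series led by the first sentence of each series."""
    series: list[list[str]] = []
    for sentence in sentences:
        for group in series:
            # Compare against the series leader only (a simplifying assumption).
            if similarity(sentence, group[0]) >= threshold:
                group.append(sentence)
                break
        else:
            series.append([sentence])  # no similar leader found: start a new series
    # Keep only series that contain at least one "fuzzy repetition".
    return [group for group in series if len(group) > 1]


# Hypothetical sample document.
sample = [
    "Press the red button to start the pump.",
    "Store the device in a dry place.",
    "Press the green button to stop the pump.",
    "Dispose of batteries properly.",
    "Store the charger in a dry place.",
]

for group in fuzzy_series(sample):
    print(group[0])            # first fuzzy of the series
    for repeat in group[1:]:
        print("    " + repeat)  # indented here instead of italics
```

Comparing against the series leader alone keeps the sketch short; a real tool might compare against every member or cluster more carefully, and the threshold would need tuning per language pair.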


 

Cristiana Coblis  Identity Verified
Romania
Local time: 00:16
Member (2004)
English to Romanian
+ ...
Frankly, I find it misleading Feb 8, 2012

I find it misleading as it is not clear on what the statement is based and it most certainly does not apply to all languages. It can be argued that "up to x%" can also mean zero or close to that...
I cannot speak for other languages, but for Romanian any fuzzy matches below 85% are more or less useless. In most cases, on an 80-85% fuzzy match I may end up having to rephrase almost all of it due to how the Romanian language works. I use glossaries a lot and build them for most projects, but even so, in Romanian you cannot just insert a word or an expression into a phrase without having to modify it, the way you would do in English. The same goes for bits and pieces of phrases.
I would not manage my time based on CAT tool algorithms, let alone homogeneity stats. These are based on the source language and since, most of the time, the source is English, they may be an accurate representation of the translation effort if you had to translate from English into English, but these are quite inaccurate for my language. I hope that is useful for some languages, but I cannot say it is useful or interesting for Romanian :(

[Edited at 2012-02-08 21:58 GMT]


 

