Make sure agencies leave "Homogeneity" switched OFF when running statistics in memoQ!
Thread poster: Michael Joseph Wdowiak Beijer

Michael Joseph Wdowiak Beijer  Identity Verified
United Kingdom
Local time: 04:01
Member (2009)
Dutch to English
+ ...
Aug 20, 2015

Just tweeted:

"Feel kind of bad re giving an agency a "2" on the #Proz Blue Board, but leaving "Homogeneity" ON in statistics in #memoQ is unforgivable.

PS: "#Homogeneity" is also known as "internal/virtual fuzzy matches" or "weighted words". Check that it is always switched OFF.

This is the culprit: http://www.proz.com/blueboard/ ××××× (××× ××× B.V.)
"

https://twitter.com/michaelbeijer/status/634411020476354560

[Edited at 2015-08-20 17:13 GMT]



Rossana Triaca  Identity Verified
Uruguay
Local time: 01:01
Member (2002)
English to Spanish
Wait, what? Aug 20, 2015

Weighting rates according to internal fuzzy matches (or whatever name the CAT du jour gives it) has been standard industry practice for several years now.

How this weighting is done (i.e., the discount matrix) is strictly a business matter between the parties involved, but unless they want to change it covertly after reaching an agreement, I'd never consider it shady.

Bear in mind you can use different metrics to estimate the amount of work (as many do, charging by hours, pages, lines, etc.), and adjusted words are just that, one more metric...



Philippe Etienne  Identity Verified
Spain
Local time: 05:01
Member
English to French
It's more insidious Aug 20, 2015

Rossana Triaca wrote:

Weighting rates according to internal fuzzy matches (or whatever name the CAT du jour gives it) has been standard industry practice for several years now.

How this weighting is done (i.e., the discount matrix) is strictly a business matter between the parties involved, but unless they want to change it covertly after reaching an agreement, I'd never consider it shady.

Bear in mind you can use different metrics to estimate the amount of work (as many do, charging by hours, pages, lines, etc.), and adjusted words are just that, one more metric...

Granted, weighted wordcounts and "Trados grids" have been around for at least as long as I've been a translator. Michael may have caused some confusion by equating Homogeneity (memoQ)/internal fuzzy matches (Trados, I think) with "weighted words", which means the product of the CAT analysis matrix and the "Trados grid" matrix, i.e. the basis of fees for translators who accept CAT discounts.

What Michael refers to is the Homogeneity checkbox in memoQ (or the Internal fuzzy matches checkbox in SDL Trados), whereby new segments are dynamically checked against previous segments of the same document, so that any new segment becomes a fuzzy match if it happens to be similar to an earlier No-match segment in that untranslated document.
Bottom line: lower weighted wordcounts.
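
To make the mechanism concrete, here is a minimal sketch of what such an analysis does. It is not memoQ's or Trados's actual algorithm: the word-level similarity from Python's difflib, the single "fuzzy" band, the 75% threshold and the sample segments are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Word-level similarity in [0, 1]; a stand-in for a CAT tool's match score."""
    return SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


def analyse(segments, internal_fuzzy, threshold=0.75):
    """Count source words per band: 'repetition', 'fuzzy' and 'no_match'."""
    counts = {"repetition": 0, "fuzzy": 0, "no_match": 0}
    seen = []  # earlier segments of the same, still untranslated document
    for seg in segments:
        words = len(seg.split())
        best = max((similarity(seg, prev) for prev in seen), default=0.0)
        if best == 1.0:
            counts["repetition"] += words
        elif internal_fuzzy and best >= threshold:
            # Only with the checkbox ticked does a similar earlier
            # segment turn this one into a (discounted) fuzzy match.
            counts["fuzzy"] += words
        else:
            counts["no_match"] += words
        seen.append(seg)
    return counts


# Hypothetical four-segment document, eight words per segment.
doc = [
    "Press the green button to start the pump",
    "Press the green button to stop the pump",
    "Press the green button to start the pump",
    "Dispose of the filter according to local regulations",
]

unchecked = analyse(doc, internal_fuzzy=False)
checked = analyse(doc, internal_fuzzy=True)
print(unchecked)
print(checked)
```

With the option unchecked, only the exact repetition leaves the no-match band; with it checked, the second segment also moves from no-match to fuzzy, and that shift is what lowers the weighted wordcount once a discount grid is applied.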

And I understand how upsetting this thing is, because I imagine he's known the Trados-without-SDL era.
In my early days, Trados did not have such an option. And it was just so pleasant to realise along the way that you would in fact spend less time than anticipated on a translation. The benefit of "internal fuzzy matches" was in our pockets, not the agencies'.

SDLX did have that option, and I avoided working with that CAT tool for that very reason.

When Trados was bought by SDL, this feature was brought to grid-version SDL Trados, and I would venture that it is now becoming a norm for agencies to check that fee-cutting option whatever the CAT tool.

Basically, it's a checkbox that decreases weighted wordcounts (compared to old Trados) with no other benefit whatsoever.

When that thing became widespread, I used to reply to agencies with not only my own discount grid, but also two word rates: one with the option unchecked and another, about 15% higher, with the option checked. Now I can't be bothered; I just advertise the higher one.

Philippe

[Edited at 2015-08-20 18:59 GMT]



Michael Joseph Wdowiak Beijer  Identity Verified
United Kingdom
Local time: 04:01
Member (2009)
Dutch to English
+ ...
TOPIC STARTER
Not on my watch! Aug 20, 2015

Rossana Triaca wrote:

Weighting rates according to internal fuzzy matches (or whatever name the CAT du jour gives it) has been standard industry practice for several years now.

How this weighting is done (i.e., the discount matrix) is strictly a business matter between the parties involved, but unless they want to change it covertly after reaching an agreement, I'd never consider it shady.

Bear in mind you can use different metrics to estimate the amount of work (as many do, charging by hours, pages, lines, etc.), and adjusted words are just that, one more metric...


It is not standard industry practice as far as I am concerned, and the minute a client tries it, they are out of my good book.

If there is a TM, I may consider offering discounts in accordance with my fuzzy discount matrix*.
If there is no TM, they may be eligible for repetitions (@ 15%), but no fuzzy matches. Sadly, this company also pays 0% for repetitions.

In general, fuzzy matches are often actually all but useless, and unless very high (95% and over), it is often faster to just translate the segment from scratch.

The company that occasioned my tweet and this post here on Proz started off by asking me to translate a file of around 6000 words. They then somehow managed to remove all kinds of things that "didn't need translating" (and in the process also stripping out pretty much all of the context that helps make sense of a text). They then imported this file (by now it was no longer a .docx) into memoQ, and asked me to check out a memoQ server project. I opened the project and ran statistics on it, and got around 2000 words.

As an aside, note that this is already worse than any count I would get in my preferred CAT tool (CafeTran), as memoQ (and SDL Studio) consistently produces lower total word counts than CafeTran (or OmegaT).

However, when they ran statistics in memoQ, they switched on "Homogeneity", which magically reduced the already greatly reduced count of 2000 words to 1100. Great, so the original file of 6000 words was now 1100 words. Guess what their end client will end up paying for? How much do you want to bet that the end client has never even heard of "weighted words" and "repetitions", and that the agency is pocketing all of the savings?

To put all this in context, there are plenty of other agencies out there that:

– pay a much higher word rate,
– pay around 15–20% for repetitions,
– only ask for fuzzy discounts if there is a TM (and if so, employ a sensible matrix; see below), and
– wouldn't dream of paying me less for these idiotic virtual fuzzies.

Obviously, we are all free to choose how to run our own business, but these penny-pinchers aren't exactly interested in building long-term relationships with their translators – or "vendors" as they like to call us in their marketing mumbo-jumbo – and can expect people like me to fight tooth and nail for my rights in our ailing industry, which is quickly being destroyed on all sides.

Michael

----------------------
*My fuzzy discount matrix (only applies if there is a TM):

100% + Repetition + 101%: 15%
95-99%: 60%
Everything else: 100%
No Match: 100%
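
As an illustration of how a matrix like this turns an analysis into a payable ("weighted") word count, here is a minimal sketch. The percentages are the ones listed above; the band names and the word counts in the sample analysis are made up for the example.

```python
# Share of the full word rate paid per band, taken from the matrix above.
MATRIX = {
    "rep_100_101": 0.15,  # 100% matches, repetitions and 101% matches
    "95_99": 0.60,        # 95-99% fuzzy matches
    "below_95": 1.00,     # "everything else"
    "no_match": 1.00,
}


def weighted_words(analysis, matrix=MATRIX):
    """Sum of words per band multiplied by that band's share of the rate."""
    return sum(words * matrix[band] for band, words in analysis.items())


# Hypothetical analysis of a 2000-word job (counts invented for illustration).
analysis = {"rep_100_101": 400, "95_99": 100, "below_95": 500, "no_match": 1000}

payable = weighted_words(analysis)
print(round(payable))  # 400*0.15 + 100*0.60 + 500 + 1000 words
```

Because everything under 95% is paid in full here, ticking "Homogeneity" would change the result only for segments that land in the 95-99% band or higher; a grid that also discounts lower bands is affected much more.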



Michael Joseph Wdowiak Beijer  Identity Verified
United Kingdom
Local time: 04:01
Member (2009)
Dutch to English
+ ...
TOPIC STARTER
Thanks for the clarification! Aug 20, 2015

Philippe Etienne wrote:

Rossana Triaca wrote:

Weighting rates according to internal fuzzy matches (or whatever name the CAT du jour gives it) has been standard industry practice for several years now.

How this weighting is done (i.e., the discount matrix) is strictly a business matter between the parties involved, but unless they want to change it covertly after reaching an agreement, I'd never consider it shady.

Bear in mind you can use different metrics to estimate the amount of work (as many do, charging by hours, pages, lines, etc.), and adjusted words are just that, one more metric...

Granted, weighted wordcounts and "Trados grids" have been around for at least as long as I've been a translator. Michael may have caused some confusion by equating Homogeneity (memoQ)/internal fuzzy matches (Trados, I think) with "weighted words", which means the product of the CAT analysis matrix and the "Trados grid" matrix, i.e. the basis of fees for translators who accept CAT discounts.

What Michael refers to is the Homogeneity checkbox in memoQ (or the Internal fuzzy matches checkbox in SDL Trados), whereby new segments are dynamically checked against previous segments of the same document, so that any new segment becomes a fuzzy match if it happens to be similar to an earlier No-match segment in that untranslated document.
Bottom line: lower weighted wordcounts.

And I understand how upsetting this thing is, because I imagine he's known the Trados-without-SDL era.
In my early days, Trados did not have such an option. And it was just so pleasant to realise along the way that you would in fact spend less time than anticipated on a translation. The benefit of "internal fuzzy matches" was in our pockets, not the agencies'.

SDLX did have that option, and I avoided working with that CAT tool for that very reason.

When Trados was bought by SDL, this feature was brought to grid-version SDL Trados, and I would venture that it is now becoming a norm for agencies to check that fee-cutting option whatever the CAT tool.

Basically, it's a checkbox that decreases weighted wordcounts (compared to old Trados) with no other benefit whatsoever.

When that thing became widespread, I used to reply to agencies with not only my own discount grid, but also two word rates: one with the option unchecked and another, about 15% higher, with the option checked. Now I can't be bothered; I just advertise the higher one.

Philippe

[Edited at 2015-08-20 18:59 GMT]


Indeed, Philippe, I meant "homogeneity" (memoQ) / "internal fuzzy matches" (SDL), not "weighted word count" (e.g., as embodied in a so-called CATCount, which I use in TO3000 to calculate my total fee for a job).

I remember telling Kilgray that this would lead to problems for us translators (back when they introduced their "homogeneity" switch), years ago in the memoQ list, and I was right: unscrupulous agencies are using it more and more these days. And sadly, many translators don't know about or understand it, getting royally $cr€w€d in the process.

Michael



Annamaria Amik  Identity Verified
Local time: 06:01
Romanian to English
+ ...
No discount for internal fuzzies Aug 20, 2015

Michael Beijer wrote:

It is not standard industry practice as far as I am concerned, and the minute a client tries it, they are out of my good book.

In general, fuzzy matches are often actually all but useless, and unless very high (95% and over), it is often faster to just translate the segment from scratch.


It is absolutely not standard practice. Even my most savings-minded clients omit the internal fuzzies in their analyses. Of course, my best clients only ask for discounts on repetitions, with no fuzzy discounts even when we use previous translations.

Yes, fuzzies are mostly useless, because sometimes reading an existing translation and finding the differences takes even more time than translating from scratch.



SDL Community  Identity Verified
United Kingdom
Local time: 05:01
English
Not quite... Aug 20, 2015

Philippe Etienne wrote:

And I understand how upsetting this thing is, because I imagine he's known the Trados-without-SDL era.
In my early days, Trados did not have such an option. And it was just so pleasant to realise along the way that you would in fact spend less time than anticipated on a translation. The benefit of "internal fuzzy matches" was in our pockets, not the agencies'.



... Trados had this too, using the "Previous TM" feature (a little-known power-user feature, I think), and I believe other tools took it from there, maybe even SDLX too (although I'm not sure which came first in that case). When Studio was first released it didn't have it, and we were under pressure from users to make it available again as an option. In fact, I think we may even have added it after memoQ already had their homogeneity solution.

Either way, it's nothing new and I doubt it's just an SDL instigated feature. It was more of an industry driven one because the tools are intended to satisfy as many requirements as possible.

Regards

Paul
SDL Community Support



Samuel Murray  Identity Verified
Netherlands
Local time: 05:01
Member (2006)
English to Afrikaans
+ ...
@Rossana Aug 20, 2015

Rossana Triaca wrote:
Weighting rates according to internal fuzzy matches ... has been standard industry practice for several years now.


Well, I guess you're right -- we're getting old. Trados 2009 SP2 introduced it in 2010. Wordfast Classic users have been able to do so with PlusToyz since about 2005. MetaTexis has been able to do it since mid-2004. I'm told that SDLX could do it even earlier, but I was unable to pin down a date. Wordfast Pro and memoQ only followed very recently.

Whether it is "standard practice" is another matter, however. Only two of my clients apply internal fuzzy matching to the weighted word counts (fortunately I can see the analyses, so I can simply adjust my per-word rate for every project accordingly). If I recall correctly, both WFP and Trados 2009+ spokespeople have spoken out against using internal fuzzies for word weighting.
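
The rate adjustment mentioned above can be sketched as follows. This is only an illustration of the arithmetic, not anyone's actual pricing method: the function name and the base rate of 0.10 per word are hypothetical, and the two counts are the approximate figures Michael reported earlier in the thread (about 2000 words without homogeneity, 1100 with).

```python
def compensating_rate(base_rate, count_without, count_with):
    """Per-word rate that keeps the total fee unchanged when the client's
    analysis (with internal fuzzies) yields a lower payable word count."""
    if count_with <= 0:
        raise ValueError("count_with must be positive")
    return base_rate * count_without / count_with


base_rate = 0.10  # hypothetical rate per word
adjusted = compensating_rate(base_rate, count_without=2000, count_with=1100)

print(f"adjusted rate: {adjusted:.4f}")
print(f"fee at adjusted rate: {adjusted * 1100:.2f}")
print(f"fee at base rate, option off: {base_rate * 2000:.2f}")
```

The point is that the total stays the same; only the split between "rate" and "count" moves, which works as long as the translator can see the analysis before quoting.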

I say this every time this discussion comes up, so I'll say it again: logically speaking, it makes more sense to offer discounts for internal matches than for external fuzzy matches, because internal fuzzy matches are more reliable (since you yourself created them), and external fuzzy matches need more careful checking (since you can't know for certain if they're from a reliable source). So although the standard practice is to request discounts for external fuzzy matches, we should really be offering lower discounts for external fuzzy matches than for internal ones.

How this weighting is done (i.e., the discount matrix) is strictly a business matter between the parties involved, but unless they want to change it covertly after reaching an agreement, I'd never consider it shady.


You have a point. However, modern CAT tool import/export files don't always make it easy to check the word counts oneself, to verify the client's word count, and as a consequence we place great faith in the agency's honesty. So I can understand Michael's feeling of betrayal.



[Edited at 2015-08-20 19:37 GMT]



Michael Joseph Wdowiak Beijer  Identity Verified
United Kingdom
Local time: 04:01
Member (2009)
Dutch to English
+ ...
TOPIC STARTER
interesting post on "TRANSLATION TRIBULATIONS" (Kevin Lossner's blog) Aug 20, 2015

As usual, Kevin Lossner has written something interesting on this topic:

"[…]

To me, this sounds an awful lot like my PM acquaintance has been blind-sided by Kilgray's homogeneity analysis, which has been a feature of memoQ for a very long time. It's a feature about which I personally have mixed feelings. Used in the wrong way by unscrupulous agencies or ignorant persons, it can be yet another club with which to clobber translators and their rates to the ground and bring about the Hobbesian state of being so many fear is in our future, if not our present. But I approach it as a valuable information tool for helping me estimate how much time a rush project might actually take. Or in the case of my correspondent's competitor, it can be used judiciously to calculate a competitive rate that might not land you in the poorhouse.

[…]

Kilgray's memoQ analyzes a text for internal redundancies and "fuzzy redundancies", the latter being referred to as having a degree of "homogeneity". But as anyone who works with CAT software knows, even high fuzzy matches can be utterly useless and cost more time than content with no statistical similarities. Translation is about meaning, not statistics, and the price assassins at Trados and other tool pimps of the past sold everyone a lousy bill of goods with nonsense marketing lies like "You'll never have to translate the same sentence again." Well, guess what? If you do successive versions of an information brochure or technical manual and don't start to update your language after a while, your text will soon sound like it was written for an age long past and might not communicate as clearly as it should. Those who can read German should have a look at the various editions of the classic cookbook Die Süddeutsche Küche by Katharina Prato, which was popular from the mid-19th century until the 1930s for truly dramatic examples of the changes in a language. (These are available online via Google Books and various libraries online. They are also a good source of offal recipes - people ate all manner of interesting things back then.) But this happens on a much shorter time scale as well: my eight-to-ten-year-old texts for the AOK social insurance brochure and various IT manuals sound rather awful and dated, though they were quite acceptable at the time they were written.

[…]"

src: http://www.translationtribulations.com/2011/03/homogeneity-another-secret-competitive.html


Jacqueline White
Austria
Local time: 05:01
Hungarian to English
+ ...
I hadn't realised this wasn't common practice Aug 20, 2015

I'm not saying I like the practice, but I agree with Samuel that it actually makes more sense than giving a discount on a TM containing translations from another translator that might be completely useless.

If repetitions are going to be charged at 0% then they should at least be locked.

With short segments in particular, sometimes you can get an 85% hit or more when there is actually no real similarity with the segment in the TM (just based on the presence of words like "the" and "and" etc.)

I've increasingly noticed the tendency to lock all the "easy" bits of the text - numbers, addresses, repetitions etc. I actually find that quite annoying, because to my mind the "easy" parts and the "hard" parts should balance each other out. And often, the locked segments contain errors (e.g. the capitalisation is wrong) or need to be changed because they affect the preceding or following segment. In that case I usually leave comments, which takes longer than it would just to translate the segment.



Philippe Etienne  Identity Verified
Spain
Local time: 05:01
Member
English to French
We had that discussion before Aug 20, 2015

SDL Community wrote:
... Trados had this too, using the "Previous TM" feature (a little-known power-user feature, I think), and I believe other tools took it from there, maybe even SDLX too (although I'm not sure which came first in that case).

Dear Paul,
Thank you for your clarification.
I was a daily user of Trados only for some years at the beginning of this millennium. The "previous TM" analysis wasn't at all a tick-box specifically made to lower wordcounts, and I attempted to explain the difference in this old thread: http://www.proz.com/forum/cat_tools_technical_help/198526-which_cat_calculates_fuzzies_within_one_batch_of_files.html

As an implementation of the old SDLX option, the Internal fuzzy matches option is only meant to decrease wordcounts compared to old Trados. Although I agree with Samuel that those internal matches may be the ones that deserve discounts most, to me coming from Trados 3, this now not-so-new option is just another insidious way to decrease weighted wordcounts.

Philippe



Michael Joseph Wdowiak Beijer  Identity Verified
United Kingdom
Local time: 04:01
Member (2009)
Dutch to English
+ ...
TOPIC STARTER
Feel free Aug 20, 2015

Whether Samuel is right or wrong with his rather academic point about whether he prefers giving internal or external fuzzy discounts, they are all anathema to me, and equally idiotic. I have a mortgage to pay and a family to feed.

You who are feeling more generous can offer discounts on imaginary matches with potential future repetitions in target sentences you may or may not write while sleeping six months and five days from your last birthday that didn't fall on a leap year, for all I care. I think I'll pass. These leeches can try some other $chmuck.

Michael



Patrick Porter
United States
Local time: 23:01
Spanish to English
+ ...
Exactly! Aug 20, 2015

Samuel Murray wrote:
...Only two of my clients apply internal fuzzy matching to the weighted word counts (fortunately I can see the analyses, so I can simply adjust my per-word rate for every project accordingly)...


That about sums it up pretty good.

I don't have a problem with weighted word counts, even internal matching, but I've always negotiated my own matrix and always insist on weighting anything under 85% fuzzy match at 100% of my rate, among other things. This arrangement tends to fit the kinds of work I do.

But ok, if you don't like working with that structure then it's your right to set your fees as you like. And it can definitely be annoying when clients don't want to respect that.



Rossana Triaca  Identity Verified
Uruguay
Local time: 01:01
Member (2002)
English to Spanish
Financial sense... Aug 21, 2015

So many points, so little time!

First of all, y'all making me feel really old, because good ol' Déjà vu had it first over a decade ago, and even before that you could always get this self-matching analysis in some roundabout way with some tinkering.

I understand its popularity rose (and pearls were clutched) when the market leader (Trados) included it too as a separate option, but again, even then it was possible to get this internal analysis in Trados by other means, and big outfits were already using it for pricing (don't hate me @Paul, but by the time you guys included it, this option was about as revolutionary as the magical Autosave included in some 2009 service pack or other).


It is not standard industry practice as far as I am concerned (...)


I don't think that's quite how it works... 

Don't shoot the messenger, I'm just relaying what is actually now the standard for agencies that demand a specific CAT as part of their workflow (keyword: demand). Then again, there's a whole other market outside of these, and I obviously support you 100% to send them packing if it doesn't make financial sense to you to accept their terms. However, I think we can agree that an agency that uses this option nowadays is not trying to embezzle you and doesn't deserve public shaming for it.*

I didn't quite understand why you would call them "imaginary" or "virtual" fuzzies - they are pattern searches of the TM you'll populate while doing the project, so there's nothing unsubstantial about them. Also, I agree with @Samuel that in my mind internal fuzzies always made much more sense than fuzzies against an unbeknownst TM which could have been filled by our dear friend, the almighty Alphabet Translates.

Internal fuzzies/homogeneity/intra-project analysis, or whatever you call this method of estimating work, has its pros and cons. The cons are obvious if you were banking on those matches as your gain for using/paying for the CAT; the pros are that all the statistical information you can gather (and I really do mean all) allows you to make a much more informed estimate of the work ahead.

Blanket rates? Bad idea... Whenever this comes up, I think that if the discount is the issue, then your rate was too low to begin with. Mind you, I use internal fuzzies for my direct clients too; for technical texts they *know* how much copypasta went into producing the text, and they appreciate the honesty and transparency of my estimates. It's a win-win scenario too, because they become repeat clients and are happy to pay me my rates without question rather than go with a cheaper translator who will ignore the exact makeup of their texts. No doubt it's psychological, but a nice breakdown of where their money goes does a lot to loosen the purse strings.

All this being said, I've never worked (nor do I expect I will) with an agency that doesn't pay at least a token 10% for repetitions, just for the sake of them *being there* as context and needing to be processed along with the rest. Also, such piecemeal work is bad news for the end client, and when things go south (and they will) your name will be on the line... I'd definitely decline their generous offer, explaining why.

---

*Let's save our righteous wrath for non-paying agencies such as LinguTek (not their real name, it's clearly a different name like Gnarnia, lest the Godz be angered), that years later still owe me 500 bucks. I'm sure the check is in the mail!



Samuel Murray  Identity Verified
Netherlands
Local time: 05:01
Member (2006)
English to Afrikaans
+ ...
@Michael, moving the goalposts Aug 21, 2015

Michael Beijer wrote:
Whether Samuel is right or wrong with his rather academic point about whether he prefers giving internal or external fuzzy discounts, they are all anathema to me, and equally idiotic.


No... my post was in reply to your original point, which is not about the acceptability of discounts in general, but about discounts specifically for internal fuzzy matching. You did not give that agency a low BB rating simply because they wanted/applied discounts, but specifically because they applied discounts on internal matches, and without asking you.

