Should "fuzzy" words be grouped together with "InternalFuzzy" words in SDL analysis?
Thread poster: Enrique Cavalitto

Enrique Cavalitto
Local time: 12:34
SITE STAFF
Jul 22, 2016

Hi all,

I am reviewing a new feature for the translation center powered by ProZ.com that will enable the import of CAT tool analyses. We are initially working on SDL Trados analyses, but soon afterwards we will add memoQ, Wordfast and other analysis formats.

When working with an example analysis, copied below, I find that for each "fuzzy" category, for instance "75% to 84%" there are two lines, one labeled "fuzzy" and the other labeled "internalFuzzy", each with its own wordcount.

My question is whether, when this analysis is used for quoting a job, these fuzzy categories should be grouped together or considered separately. In other words, if we assign a given rate per word for the 75% to 84% category, should the 45 words in "fuzzy" be added to the 155 words in "internalFuzzy" to give a total of 200 words for the category?

And if this is not the case, how should they be treated?


  • perfect segments="0" words="0" characters="0" placeables="0" tags="0"

  • inContextExact segments="15" words="208" characters="1259" placeables="9" tags="0"

  • exact segments="1" words="2" characters="10" placeables="1" tags="0"

  • locked segments="38" words="547" characters="2997" placeables="14" tags="0"

  • crossFileRepeated segments="0" words="0" characters="0" placeables="0" tags="0"

  • repeated segments="0" words="0" characters="0" placeables="0" tags="0"

  • total segments="538" words="15043" characters="81550" placeables="213" tags="0"

  • new segments="83" words="2590" characters="13654" placeables="23" tags="0"

  • fuzzy min="50" max="74" segments="1" words="9" characters="55" placeables="1" tags="0"

  • fuzzy min="75" max="84" segments="4" words="45" characters="258" placeables="3" tags="0"

  • fuzzy min="85" max="94" segments="4" words="51" characters="241" placeables="1" tags="0"

  • fuzzy min="95" max="99" segments="372" words="11347" characters="61621" placeables="145" tags="0"

  • internalFuzzy min="50" max="74" segments="7" words="84" characters="492" placeables="4" tags="0"

  • internalFuzzy min="75" max="84" segments="12" words="155" characters="934" placeables="11" tags="0"

  • internalFuzzy min="85" max="94" segments="0" words="0" characters="0" placeables="0" tags="0"

  • internalFuzzy min="95" max="99" segments="1" words="5" characters="29" placeables="1" tags="0"
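
To make the two options concrete, here is a minimal Python sketch that reads the fuzzy rows of the listing above and reports the word count per band, both separately and grouped. The row format is simply the one pasted here (shortened to the attributes used), and the names `SAMPLE`, `ROW` and `words_by_band` are made up for illustration; they are not part of any SDL format or tool.

```python
import re
from collections import defaultdict

# Fuzzy rows copied from the sample analysis (attributes not needed here are omitted).
SAMPLE = '''
fuzzy min="50" max="74" segments="1" words="9"
fuzzy min="75" max="84" segments="4" words="45"
fuzzy min="85" max="94" segments="4" words="51"
fuzzy min="95" max="99" segments="372" words="11347"
internalFuzzy min="50" max="74" segments="7" words="84"
internalFuzzy min="75" max="84" segments="12" words="155"
internalFuzzy min="85" max="94" segments="0" words="0"
internalFuzzy min="95" max="99" segments="1" words="5"
'''

# Hypothetical pattern for one row: category, band limits, word count.
ROW = re.compile(r'^(fuzzy|internalFuzzy) min="(\d+)" max="(\d+)".*?words="(\d+)"')

def words_by_band(text):
    """Return {(min, max): {"fuzzy": n, "internalFuzzy": m}} for the rows found."""
    bands = defaultdict(dict)
    for line in text.strip().splitlines():
        match = ROW.match(line.strip())
        if match:
            kind, low, high, words = match.groups()
            bands[(int(low), int(high))][kind] = int(words)
    return dict(bands)

# Separate view: one count per category and band.
bands = words_by_band(SAMPLE)

# Grouped view: fuzzy and internalFuzzy added together per band.
grouped = {band: sum(counts.values()) for band, counts in bands.items()}

for band in sorted(bands):
    print(band, bands[band], "grouped:", grouped[band])
```

For the 75% to 84% band this gives 45 and 155 words separately, or 200 words when grouped.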




Thanks in advance for your help.

Regards,
Enrique



Bernhard Sulzer  Identity Verified
United States
Local time: 10:34
English to German
+ ...
There's nothing fuzzy about fuzzies IMO Jul 22, 2016

Enrique Cavalitto wrote:

...

My question is whether, when this analysis is used for quoting a job, these fuzzy categories should be grouped together or considered separately. In other words, if we assign a given rate per word for the 75% to 84% category, should the 45 words in "fuzzy" be added to the 155 words in "internalFuzzy" to give a total of 200 words for the category?

And if this is not the case, how should they be treated? ...



Please excuse me Enrique if I don't give you a simple technical answer but I feel I need to contribute a few fundamental thoughts and questions to your post.

I hold that using fuzzies (no matter what type), repeats and matches as a basis for a quote is not something I can condone or recommend. Such analyses are usually used by certain agencies to demand unacceptably low rates. Having certain percentages of fuzzies (internal ones are IMO quite a ridiculous category, as has been discussed in the forums!), repeats, or matches says nothing about the actual translation work or the complexity of the task involved, which must be carefully assessed by each translator before he/she provides a quote.

Can I ask what you mean by "if WE assign a given rate ..."? Are you planning to recommend on that platform certain rate discounts for these fuzzies, which in reality are totally arbitrary with regard to actual translation work? I don't see why that is necessary or a good thing.

Are you basing these "fuzzy" analyses on the principle of using an already existing "excellent" TM - which I don't think is the case? (Even matching segments against TMs wouldn't be fair as a basis for quotes, of that I am sure.)

TM or no TM, this kind of "machine"-analysis has nothing to do with comparable matches in the target text/translation that could be "predicted" or argued as having the same percentages and thus would give people the idea it's okay to provide quotes based on such analyses - which isn't okay anyway.

What is Proz.com trying to achieve with this for the good of our profession? Could you clarify please?

I really am concerned that the way you present this topic here implies that basing rates/prices on fuzzies and repeats and matches is completely legitimate and all we are talking about here are technical details, which I hold isn't right.

[Edited at 2016-07-22 17:19 GMT]



Siegfried Armbruster  Identity Verified
Germany
Local time: 16:34
Member (2004)
English to German
+ ...
Don't group them by default Jul 22, 2016

Enrique Cavalitto wrote:

When working with an example analysis, copied below, I find that for each "fuzzy" category, for instance "75% to 84%" there are two lines, one labeled "fuzzy" and the other labeled "internalFuzzy", each with its own wordcount.

My question is whether, when this analysis is used for quoting a job, these fuzzy categories should be grouped together or considered separately. In other words, if we assign a given rate per word for the 75% to 84% category, should the 45 words in "fuzzy" be added to the 155 words in "internalFuzzy" to give a total of 200 words for the category?

And if this is not the case, how should they be treated?


Hi Enrique,

Keep them separate; some might prefer to charge different rates, or not to offer fuzzy rates for internal fuzzies at all.
The optimal solution would be an optional setting where users could decide whether to calculate their quote based on a grouped or an individual analysis. This should be possible as a default setting for all quotes, but also as an individual setting per quote.
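
A rough Python sketch of such a setting, with a default for all quotes and an optional override per quote, could look like this. `QuoteSettings`, `group_internal_fuzzy` and `band_counts` are invented names for illustration only; nothing here describes how the translation center actually implements it.

```python
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class QuoteSettings:
    # Hypothetical default for all quotes: keep internal fuzzies separate.
    group_internal_fuzzy: bool = False

def band_counts(fuzzy_words: int,
                internal_words: int,
                defaults: QuoteSettings,
                per_quote_override: Optional[bool] = None) -> Dict[str, int]:
    """Word counts for one fuzzy band, grouped or separate.

    The per-quote override, when given, wins over the default setting.
    """
    if per_quote_override is None:
        group = defaults.group_internal_fuzzy
    else:
        group = per_quote_override
    if group:
        return {"fuzzy": fuzzy_words + internal_words}
    return {"fuzzy": fuzzy_words, "internalFuzzy": internal_words}

# 75% to 84% band from the sample analysis: 45 fuzzy words, 155 internal fuzzy words.
defaults = QuoteSettings()
separate = band_counts(45, 155, defaults)                         # default applies
grouped = band_counts(45, 155, defaults, per_quote_override=True)  # this quote only
```

With the default left at "separate", the 75% to 84% band keeps its two counts of 45 and 155 words; with the per-quote override switched on it becomes a single count of 200.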

Best regards
Siegfried

[Edited at 2016-07-22 13:11 GMT]



Emma Goldsmith  Identity Verified
Spain
Local time: 16:34
Member (2010)
Spanish to English
Separate Jul 22, 2016

Siegfried Armbruster wrote:

Keep them separate; some might prefer to charge different rates, or not to offer fuzzy rates for internal fuzzies at all.


Definitely agree with Siegfried here. Mostly, agencies don't count internal fuzzies at all. To bundle them with fuzzy matches from a TM would be misleading (and have a negative impact from a translator's point of view).

The concept of internal fuzzies isn't unique to Studio, by the way. When you move on to memoQ, for instance, you'll find them classified as "homogeneity", with their own separate analysis.



Roy Oestensen  Identity Verified
Norway
Local time: 16:34
Member (2010)
English to Norwegian (Bokmal)
+ ...
Not sure that is the case Jul 22, 2016

Emma Goldsmith wrote:

Definitely agree with Siegfried here. Mostly, agencies don't count internal fuzzies at all. To bundle them with fuzzy matches from a TM would be misleading (and have a negative impact from a translator's point of view).


I have discovered that at least some of the agencies I work for do count internal fuzzies when preparing the POs, without having told me beforehand. I was not very pleased when I discovered this. After all, they were trying to push the price as low as possible.

Of course the more serious agencies don't do this.



Drew MacFadyen
SITE STAFF
Analysis based on cat tool parameters Jul 22, 2016

I'm not a translator, so I'm coming at this from a purely technical standpoint. I gather there is no industry standardization on analysis: some tools use Perfect match, others 101% match, and how each tool determines that may not be the same. I think some even allow users to set/manipulate those parameters on the freelance side.

I don't know enough about this technical implementation, but would it be possible to keep the analysis exactly as SDL does it, just reference the parameters (link to or display them), and state that this is SDL-specific analysis? That would entail different analysis reports/displays for each tool, which is likely not development friendly.

Regards,

Drew



Enrique Cavalitto
Local time: 12:34
SITE STAFF
TOPIC STARTER
Thanks a lot! Jul 22, 2016

Thanks friends, this is really useful.

The InternalFuzzy 75% to 84% words are part of the word count, so you either consider them as part of that category or you add them to a different category so that they are included in the project word count.

Regards,
Enrique



Lorenzo Bermejo
Local time: 16:34
English to Spanish
+ ...
75% threshold Jul 23, 2016

In my case, the agency I work for prepares their budgets considering all fuzzies below 75% as new words.
And for example, 95-99% fuzzies are paid at 10 or 12%.
I reckon they also bundle the internal fuzzies into this, but I wouldn't say it's unfair.
Of course, they pretranslate with reliable TMs, or pay extra "man labour" when 100% matches have to be reviewed.



OMNIAGE
Bulgaria
Local time: 17:34
Member (2004)
English to Hungarian
+ ...
Flexible options to choose, including weighted word count Jul 27, 2016

Hi Enrique,

I think it will be best for the user to have the option to quickly choose whether or not to include those options, as well as to set the percentage of each parameter by him/herself. If those parameters (like internal matches, etc.) need to be present, each category should be listed separately so that it is easier to understand. I would highly recommend having the weighted word count option as well, which calculates the total volume based on the different weight each fuzzy category has.
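
As an illustration of a weighted word count, here is a small Python sketch using the "new" and "fuzzy" word counts from the sample analysis in the first post. The weights are purely hypothetical examples of what a user might enter (percent of the full per-word rate), not recommended rates, and `weighted_word_count` is an invented name.

```python
# Word counts taken from the sample analysis ("new" and the "fuzzy" bands only).
COUNTS = {"new": 2590, "50-74": 9, "75-84": 45, "85-94": 51, "95-99": 11347}

# Hypothetical user-defined weights, as a percentage of the full per-word rate.
WEIGHTS = {"new": 100, "50-74": 100, "75-84": 60, "85-94": 40, "95-99": 10}

def weighted_word_count(counts, weights):
    """Sum each category's words scaled by its weight (in percent)."""
    return sum(counts[category] * weights[category] for category in counts) / 100

total = weighted_word_count(COUNTS, WEIGHTS)
print(f"Weighted word count: {total:.1f}")
```

Internal fuzzies could be given their own entries in both dictionaries, so that each category stays listed separately and the user decides how much weight, if any, it carries.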

Kind regards,
Zhana

