Very different analysis results of same source for different languages
Thread poster: Mikhail Kropotov

Mikhail Kropotov  Identity Verified
Russian Federation
Local time: 09:51
Member (2005)
English to Russian
Mar 9, 2015

I'm using memoQ 2013 R2 to coordinate translations of the same source file into four languages: Russian, French, German and Spanish. The source file contains software strings in .po format for localization.

I've had the file completely translated into all four languages, and all translations have been saved to their respective TMs. The development team then had to change a couple of strings in the source file. Running statistics on the updated file, which is almost identical to the previous version, I get two very different sets of analysis results.

For Russian, Spanish and French I get:

Type          Segments   Source words
=====================================
All                782           2962
Repetition           0              0
101%               765           2896
100%                12             35
95%-99%              0              0
85%-94%              0              0
75%-84%              2              7
50%-74%              2              9
No match             1             15

But for German, on the same source file, I get:

Type          Segments   Source words
=====================================
All                728           2698
Repetition          11             25
101%               227            807
100%               302            881
95%-99%             18             42
85%-94%             26            136
75%-84%             65            346
50%-74%            112            519
No match            21            206
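
As a quick sanity check on the figures as posted, the category rows of each table can be summed and compared with the "All" row. This is only a sketch: the dictionary and function names are made up for illustration, and the numbers are copied by hand from the two tables above.

```python
# Figures copied from the two analysis tables above: (segments, source words).
RU_ES_FR = {
    "All": (782, 2962),
    "Repetition": (0, 0),
    "101%": (765, 2896),
    "100%": (12, 35),
    "95%-99%": (0, 0),
    "85%-94%": (0, 0),
    "75%-84%": (2, 7),
    "50%-74%": (2, 9),
    "No match": (1, 15),
}

DE = {
    "All": (728, 2698),
    "Repetition": (11, 25),
    "101%": (227, 807),
    "100%": (302, 881),
    "95%-99%": (18, 42),
    "85%-94%": (26, 136),
    "75%-84%": (65, 346),
    "50%-74%": (112, 519),
    "No match": (21, 206),
}


def category_totals(table):
    """Sum segments and words over every row except the 'All' row."""
    segments = sum(s for name, (s, _) in table.items() if name != "All")
    words = sum(w for name, (_, w) in table.items() if name != "All")
    return segments, words


for label, table in (("RU/ES/FR", RU_ES_FR), ("DE", DE)):
    print(label, "reported:", table["All"], "summed:", category_totals(table))
# RU/ES/FR reported: (782, 2962) summed: (782, 2962)
# DE reported: (728, 2698) summed: (782, 2962)
```

Going by these numbers, the German category rows add up to the same 782 segments and 2962 words as the other three languages, and only the "All" row differs. If that row is simply a copying slip, the source is being segmented identically and the difference lies entirely in how the segments are matched against the TM.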

Could someone please explain this drastic difference, or maybe tell me where to look next to understand what causes it?

Thank you in advance for any ideas.

[Edited at 2015-03-09 15:11 GMT]



Rossana Triaca  Identity Verified
Uruguay
Local time: 03:51
Member (2002)
English to Spanish
Segmentation Rules Mar 10, 2015

First thing that came to mind, given the different number of segments/words, is an issue with the segmentation rules.

Are you sure you are using the same segmentation rules for the source file in all the analyses? These are usually determined by the source language, but if you opened or edited the file and changed the language (or the encoding, or the wrapping) midway, this could explain the difference.



USTranslation
United States
Local time: 23:51
English
Homogeneity/reps take precedence Mar 10, 2015

Hello, Mikhail!

Differences in analysis numbers may come from several things:

First, you could double-check whether your TM really contains the source segments you are analyzing. Use "Export to TMX", then view the exported file with Okapi Olifant: http://okapi.sourceforge.net/downloads.html
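
If you would rather script that check than browse the export by hand, something along these lines might do. It is a rough sketch, not a memoQ feature: it assumes a standard TMX export (`tu` / `tuv` / `seg` elements with an `xml:lang` attribute), the function names and the sample data are invented for illustration, and it compares plain text only, ignoring inline tags and context.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Tiny made-up TMX sample standing in for a real "Export to TMX" file.
SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="example" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en-US"><seg>Save file</seg></tuv>
      <tuv xml:lang="de-DE"><seg>Datei speichern</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en-US"><seg>Open file</seg></tuv>
      <tuv xml:lang="de-DE"><seg>Datei laden</seg></tuv>
    </tu>
  </body>
</tmx>"""


def translated_sources(tmx_text, source_lang, target_lang):
    """Collect source texts of units that have a non-empty target translation."""
    root = ET.fromstring(tmx_text)
    sources = set()
    for tu in root.iter("tu"):
        texts = {}
        for tuv in tu.findall("tuv"):
            lang = (tuv.get(XML_LANG) or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also picks up text nested inside inline tags.
                texts[lang] = "".join(seg.itertext()).strip()
        # Match on the language prefix so "en" also covers "en-us".
        src = next((t for l, t in texts.items() if l.startswith(source_lang)), None)
        tgt = next((t for l, t in texts.items() if l.startswith(target_lang)), None)
        if src and tgt:
            sources.add(src)
    return sources


def missing_from_tm(segments, tmx_text, source_lang, target_lang):
    """Return the segments that have no exact source match in the TMX."""
    known = translated_sources(tmx_text, source_lang, target_lang)
    return [s for s in segments if s not in known]


print(missing_from_tm(["Save file", "Open file", "Close file"], SAMPLE_TMX, "en", "de"))
# ['Close file']
```

Running it once per target language against the same list of source strings would show whether the German TM is actually missing entries the other three TMs have.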

If all your translations are there, you could check the following:

The "Project TMs and corpora" checkbox may be unchecked, causing the analysis to bypass the TM
The "Homogeneity" checkbox may be checked, enabling fuzzy matches from within the project with no TM
The "Repetitions take precedence over 100% matches" checkbox may be unchecked, causing all repetitions to be counted as 100% matches

Make sure "Project TMs and corpora" is checked. "Homogeneity" should be unchecked. "Repetitions take precedence over 100%" should be checked. "Disable cross-file repetitions" should also be checked.

Some of these options may only be available in memoQ 2014 R2 but I'm not sure.

Good luck!

Nick Lambson
U.S. Translation Company

