What are cross-file repetitions?
Thread poster: Hin und Wieder

Hin und Wieder  Identity Verified
Netherlands
Local time: 07:12
Member (2012)
German to Dutch
Jan 3, 2014

I received an Analyze Files report covering 3 files.
In the Total section it showed 8,000 repetitions and 3,000 cross-file repetitions.
The total word count is around 75,000.
File 1 had just repetitions, as had file 2. File 3 showed 3,000 cross-file reps and around 300 normal reps. Does this mean that if I translate file 1 first, then 2 and 3, I will end up with a lot of repetitions in the last file, but if I start with file 3, those cross-file repetitions will show up in another file?
It confuses me... Who can give me more information? And why aren't they counted as repetitions?



Jan Willem van Dormolen  Identity Verified
Netherlands
Local time: 07:12
English to Dutch
Yes, I think Jan 3, 2014

AIUI, you're correct. Cross-file reps are segments that appear in several files. So once you've translated one file, the next becomes easier. It shouldn't matter in what order you do them. The cross-file reps appear in file 3, because that's the order in which the files are analyzed.
You can experiment - rename the files so they get a different (alphabetical) order and run the analysis again. Again the cross-file reps should appear in the last file that's analyzed.
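To make the order effect concrete, here is a minimal Python sketch. It is not how Trados or any other CAT tool actually computes its report; the function name `analyze`, the sample file names and the classification rule (a segment already seen earlier in the same file counts as a repetition, one seen only in a previously analyzed file counts as a cross-file repetition) are assumptions for illustration.

```python
def analyze(files):
    """Count words per category for each file, in analysis order.

    files: list of (name, [segments]) tuples.
    Returns {name: {"new": n, "repetitions": n, "cross_file": n}}.
    Assumed rule: exact-match segments only, counted in words.
    """
    seen_in_earlier_files = set()
    report = {}
    for name, segments in files:
        seen_in_this_file = set()
        counts = {"new": 0, "repetitions": 0, "cross_file": 0}
        for seg in segments:
            words = len(seg.split())
            if seg in seen_in_this_file:
                counts["repetitions"] += words   # repeated within this file
            elif seg in seen_in_earlier_files:
                counts["cross_file"] += words    # first seen in an earlier file
            else:
                counts["new"] += words
            seen_in_this_file.add(seg)
        seen_in_earlier_files |= seen_in_this_file
        report[name] = counts
    return report


# Hypothetical sample files
file_a = ("a.txt", ["Press the red button", "Close the lid", "Press the red button"])
file_b = ("b.txt", ["Close the lid", "Open the valve"])

print(analyze([file_a, file_b]))  # cross-file words land in b.txt
print(analyze([file_b, file_a]))  # same files, reversed: they land in a.txt
```

In both runs the project total of cross-file words is the same (3); only the file they are attributed to changes with the analysis order.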



José Henrique Lamensdorf  Identity Verified
Brazil
Local time: 03:12
English to Portuguese
The complete story Jan 5, 2014

Angélique, I'll give you the complete story, so you'll get the picture.

My specialty in translation is management development courseware, been doing it since 1983, hence before CAT tools became popular.

A training program often comprises a course leader/instructor/facilitator's guide, participants' workbooks, miscellaneous handouts, and possibly PowerPoint presentations.

Several phrases are repeated throughout, as all these materials work together, and they must be identical, as well as their translations.

As computers gradually replaced typewriters in producing this material over the years, and since improved course leader's guides began including copies of all other materials to spare the instructor from rummaging through various sources, some of these clients found an obvious way to cut costs: they'd have me translate only the leader's guide, and then a sesquilingual staff member would copy & paste the repeated segments into the other publications. Of course, they'd also have to attempt translating whatever was not reproduced in the leader's guide, which sometimes compromised the overall quality.

As I joined the CAT tools trend, I began offering these repeated segments for free. Alt+Dn (WordFast) costs me nothing, and the client would hire me to translate the entire training package. They'd save on that staff member's time, and get the complete job from me.

The same occurs with parts-list type jobs. I had a long series of huge parts lists to translate. About 70% of the segments were repeated, so I gave them for free, and yet it was quite profitable.

This also happens with instruction manuals of similar equipment.

While I don't give any discount for fuzzy matches, ever, giving repeated segments (when they are in fact to be identical) for free is often a mutually advantageous deal.



neilmac  Identity Verified
Spain
Local time: 07:12
Spanish to English
IMHO Jan 7, 2014

It's just yet another BS invented by intermediaries to try to screw as much as they can out of the people they exploit. Verbal jiggery pokery. Snake oil.

[Edited at 2014-01-07 12:38 GMT]



Jan Willem van Dormolen  Identity Verified
Netherlands
Local time: 07:12
English to Dutch
Sorry Jan 7, 2014

neilmac wrote:

It's just yet another BS invented by intermediaries to try to screw as much as they can out of the people they exploit. Verbal jiggery pokery. Snake oil.

[Edited at 2014-01-07 12:38 GMT]


Sorry, but, IMHO, that's just ignorant.
Cross-file reps are very useful, as anyone who has ever done catalogue translations can tell you.



Samuel Murray  Identity Verified
Netherlands
Local time: 07:12
Member (2006)
English to Afrikaans
Which CAT tool? Jan 7, 2014

Angelique Blommaert wrote:
I received an Analyze Files Report, there were 3 files for the analysis. At the total section it mentioned 8,000 repetitions and 3,000 cross-file repetitions. Total wordcount around 75,000.


Is this in Trados?

From what I can tell, Trados uses the term "cross-file repetitions" for normal repetitions, i.e. segments that repeat within the project as a whole (and not just within a single file), but I was not aware that Trados actually gives those repetitions separately. It may be that the 8000 figure includes the 3000 figure. What is the unique word count?

Is it so that when I start translating with file 1, then 2 and 3 at the end I will have a lot of repetitions in the last file, but when I start with file 3 I will have those cross-file repetitions in another file?


I don't understand how file 1 can have 8000 repetitions (and contain ONLY repetitions) if there are only 3000 cross-file repetitions in the project. If file 1 contains only repetitions, then that would mean that they are all cross-file repetitions. Right? Or do you mean that file 1 contains more than 8000 words in total, but that 8000 of them are in repetitions?



neilmac  Identity Verified
Spain
Local time: 07:12
Spanish to English
Ignorance was bliss Jan 7, 2014

Jan Willem van Dormolen wrote:

neilmac wrote:

It's just yet another BS invented by intermediaries to try to screw as much as they can out of the people they exploit. Verbal jiggery pokery. Snake oil.

[Edited at 2014-01-07 12:38 GMT]


Sorry, but, IMHO, that's just ignorant.
Cross-file reps are very useful, as anyone who has ever done catalogue translations can tell you.

My original post was intended as a jokey, post new-year comment. However, since it seems to have been taken seriously, perhaps I should explain my opinion.

As I understand it, the notion of reducing the translator's fees on the basis of repetitions, whether within one file or across several, is no more than a strategy frequently applied by clients - who are usually agencies - to whittle away at the translator's earnings. Please forgive me if I'm mistaken in this particular case.

[Edited at 2014-01-07 16:34 GMT]



Hin und Wieder  Identity Verified
Netherlands
Local time: 07:12
Member (2012)
German to Dutch
TOPIC STARTER
Points of view Jan 9, 2014

Thank you, guys, for sharing your points of view. As I am still working on the project, I can confirm they are very useful. I always check the confirmation statistics and I was pretty surprised to see the % rise so fast.
The total count does not match reality, though... There were supposed to be no fuzzies in the 50-99% range, but the text is just full of matches: 96, 93, 95, etc. So the winner of today will be me, as I have a PO with a stated amount for this project, based on their analysis.



SDL Community  Identity Verified
United Kingdom
Local time: 07:12
English
Now you’re opening another can of worms ;-) Jan 9, 2014

Angelique Blommaert wrote:

The total count does not match reality, though... There were supposed to be no fuzzies in the 50-99% range, but the text is just full of matches: 96, 93, 95, etc. So the winner of today will be me, as I have a PO with a stated amount for this project, based on their analysis.


Internal fuzzy matching... homogeneity... or whatever your favourite tool likes to call it!
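For readers unfamiliar with the term, here is a rough Python sketch of the idea: each segment is compared with the segments that come before it in the same project, and a near-identical earlier segment counts as an internal fuzzy match. The similarity measure (word-level `difflib.SequenceMatcher`), the 75% threshold and the function name `internal_fuzzy_matches` are assumptions for illustration; real tools use their own scoring, so the percentages will differ.

```python
from difflib import SequenceMatcher


def internal_fuzzy_matches(segments, threshold=0.75):
    """Return (index, best_earlier_index, percent) for each segment whose
    best match among EARLIER segments is fuzzy (>= threshold, < 100%)."""
    results = []
    for i, seg in enumerate(segments):
        best_ratio, best_j = 0.0, None
        for j in range(i):
            # Compare word lists rather than characters
            ratio = SequenceMatcher(None, segments[j].split(), seg.split()).ratio()
            if ratio > best_ratio:
                best_ratio, best_j = ratio, j
        if best_j is not None and threshold <= best_ratio < 1.0:
            results.append((i, best_j, round(best_ratio * 100)))
    return results


# Hypothetical segments: nothing in the TM yet, but segment 1 resembles segment 0
segments = [
    "Tighten the M6 bolt on the left bracket",
    "Tighten the M8 bolt on the left bracket",
    "Close the cover",
]
for index, earlier, percent in internal_fuzzy_matches(segments):
    print(f"segment {index} is a {percent}% match of segment {earlier}")
```

An analysis run against an empty TM without this kind of comparison would report both bolt segments as new, yet the second one shows up as a high fuzzy match as soon as the first has been translated and stored.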



Little Woods  Identity Verified
Vietnam
Member
English to Vietnamese
Hi all, I have a case with cross file repetition and need your opinion Nov 11, 2014

My client sent me an Excel file with 3 sheets, along with its Analyze Files report in DOC format. The report showed 3252 repeated words and 34877 new words, for a total word count of 38129 words. My PO was based on these numbers.

Then they split the file into 12 smaller files, because I couldn't work on the Excel file with 3 sheets, and for progress management. Each smaller file contains about 4000 words, some more and some less. I didn't notice any difference until I was working on file 10 and realized the job should have been finished after file 9, since each file is about 4000 words.

I checked the analysis file in XML format that came with the package. It says 34837 new words (tags), 1031 repeated words (tags), 10888 cross-file repeated words (tags), and a total word count of 46757 words.

This seemed strange, so I asked them to revise the PO, but my client sent the original analysis report again, with 38129 words and 3252 repeated words, and still considers it the basis for the PO.

In Trados, even though the repeated segments are auto-filled, I still have to check them, and that takes time. And if the reviewer or client changes something, I have to track down every occurrence and apply the change.

I don't know how the cross-file repeated words make the Trados count and the client's count so different, although the new words are the same. In this case, should I ask the client to change the 3252 repeated words in the original calculation to 10888 words, as in Trados? And what is the difference between repeated words and cross-file repeated words?

How should I go about this?
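As a quick sanity check on the two reports, the figures quoted above can be tallied in a few lines of Python. Only the numbers come from the post; the dictionary names and the helper `total_words` are made up for illustration, and the tally says nothing about why the two analyses differ.

```python
def total_words(report):
    """Sum the word counts of all categories in a report."""
    return sum(report.values())


# Figures as quoted in the post
po_report = {"new": 34877, "repetitions": 3252}
package_report = {"new": 34837, "repetitions": 1031, "cross_file": 10888}

po_total = total_words(po_report)
package_total = total_words(package_report)

repeated_po = po_report["repetitions"]
repeated_package = package_report["repetitions"] + package_report["cross_file"]

print("PO report total:       ", po_total)        # 38129, as stated
print("Package report total:  ", package_total)   # 46756, one less than the stated 46757
print("Extra repeated words:  ", repeated_package - repeated_po)      # 8667
print("Difference in new words:", po_report["new"] - package_report["new"])  # 40
```

The new-word counts differ by only 40, so almost all of the gap between the two totals sits in the repeated and cross-file repeated categories.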

