Fragment repetition count without translation memory
Thread poster: Mar Jiménez Quesada

Mar Jiménez Quesada
United Kingdom
Local time: 05:16
English to Spanish
Apr 11, 2017

Let's assume that we have a file to translate that contains repetitions but there is no translation memory, so we are looking at the number of repeated fragments within the file. Does the software include in the repeated fragment count the first instance of a repeated fragment (i.e., the one that you will have to translate from scratch)?
For example, suppose the fragment "xxxxab" appears 3 times. Although it occurs 3 times, one of those occurrences will have to be translated from scratch, so it should not be counted as a repetition. Will the software then place 2 of those fragments in the repeated count and the other one in the "new fragment" count?


 

Lianne van de Ven  Identity Verified
United States
Local time: 00:16
Member (2008)
English to Dutch
It depends Apr 11, 2017

on the software and on how you define a "fragment". Repeated words should not be counted as repeated content (although some software seems to "think" so). Repeated fragments should follow certain rules, although I don't know which ones - it depends on the software.

 

Emma Goldsmith  Identity Verified
Spain
Local time: 06:16
Member (2010)
Spanish to English
Segment or fragment? Apr 12, 2017

Mar Jiménez Quesada wrote:

Does the software include in the repeated fragment count the first instance of a repeated fragment


By default, a segment is a standalone unit that usually ends with a full stop. So a sentence is the most common-sized segment.
A fragment is a chunk or group of words within a segment. It isn't a standalone unit.

In the above explanation there are 4 segments. "a standalone unit" is a fragment that appears in two of those segments.
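
To make the distinction concrete, here is a minimal Python sketch that splits the explanation above into segments and looks for the fragment. The splitting rule (a full stop followed by whitespace) and the helper name `split_segments` are assumptions made for illustration; they are not Studio's actual segmentation rules.

```python
import re

# The two-line explanation above, joined into one string.
text = (
    "By default, a segment is a standalone unit that usually ends with a full stop. "
    "So a sentence is the most common-sized segment. "
    "A fragment is a chunk or group of words within a segment. "
    "It isn't a standalone unit."
)

def split_segments(text):
    # Naive rule (assumption): a segment ends at a full stop followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=\.)\s+", text) if s.strip()]

segments = split_segments(text)
fragment = "a standalone unit"

# 1-based positions of the segments that contain the fragment.
hits = [i for i, s in enumerate(segments, start=1) if fragment in s]

print(len(segments))  # 4
print(hits)           # [1, 4]
```

The fragment turns up inside the first and fourth segments, but neither of those segments is a repetition of the other, which is why the segment/fragment distinction matters for the counts.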

I expect you're referring to segments. In that case, where there are four identical segments, Studio considers the first as new content and the other three as repetitions.
If you're talking about fragments, then a fragment that is long enough (say, one that accounts for 80% of the segment) will be picked up as a fuzzy match. To get Studio to report repeated fragments (fuzzy matches) within a file when there is no translation memory, tick "Report internal fuzzy match leverage" in the Analyze Files batch task. Again, Studio will count the first such segment as new content and the subsequent ones as fuzzy matches.
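
The counting logic described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `analyse`, the 0.8 threshold, and the use of Python's `difflib.SequenceMatcher` as a similarity measure are all stand-ins, and Studio's real matching algorithm and scores will differ.

```python
from difflib import SequenceMatcher

def analyse(segments, fuzzy_threshold=0.8):
    """Classify each segment as 'new', 'repetition' or 'internal fuzzy'.

    Hypothetical helper: the character-based ratio and the threshold are
    assumptions for illustration, not Studio's scoring.
    """
    seen = []  # distinct segments encountered so far, in file order
    counts = {"new": 0, "repetition": 0, "internal fuzzy": 0}
    for seg in segments:
        if seg in seen:
            # An exact copy of an earlier segment: a repetition.
            counts["repetition"] += 1
            continue
        # Best similarity against any earlier distinct segment.
        best = max(
            (SequenceMatcher(None, seg, prev).ratio() for prev in seen),
            default=0.0,
        )
        if best >= fuzzy_threshold:
            counts["internal fuzzy"] += 1
        else:
            counts["new"] += 1
        seen.append(seg)
    return counts

# Three identical segments: the first is new, the other two are repetitions.
print(analyse(["xxxxab"] * 3))

# A near-duplicate between two copies of the same sentence.
sample = [
    "Press the red button to start the machine.",
    "Press the green button to start the machine.",
    "Press the red button to start the machine.",
]
print(analyse(sample))
```

With three copies of "xxxxab", the sketch reports one new segment and two repetitions, which matches the behaviour described for identical segments. In the second sample, the "green" sentence is counted as an internal fuzzy match against the first sentence, and the third sentence as a repetition of the first.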

In Studio 2017, fragments are processed in a new way, which I discussed in this post:
https://signsandsymptomsoftranslation.com/2016/11/17/studio-2017-uplift/


 

Lianne van de Ven  Identity Verified
United States
Local time: 00:16
Member (2008)
English to Dutch
Well... Apr 14, 2017

I totally missed which forum this question was asked in.

 

