Cross file analysis for fuzzy matches in Studio 2011
Thread poster: rcorbeil

rcorbeil
Local time: 15:36
Sep 20, 2012

Hi,

How does one check for fuzzy matches between files in Studio 2011?

I was hoping that the Internal Fuzzy Matching option would do the trick, but it doesn't seem to do cross file analysis.

This could be achieved in Workbench 2007 with the "Using a TM from a previous analysis" option.

As this is something I need to do on a regular basis, any ideas would be greatly appreciated.

Thanks.


 

SDL Community  Identity Verified
United Kingdom
Local time: 22:36
English
Maybe you don't have the options checked? Sep 21, 2012

Hi,

You are on the right lines, as "Report internal fuzzy match leverage" is the function you want. But in Tools - Options - Language Pairs - All Language Pairs - Batch Processing - Analyze Files you will find an option called "Report Cross-file Repetitions" at the top. Make sure this is checked and try again. If it's an existing project, make sure you do this under Project Settings and not Tools - Options.

Regards

Paul


 

rcorbeil
Local time: 15:36
TOPIC STARTER
That's what I thought too Sep 21, 2012

Hi,

Thank you for your reply.

I too thought that this would do the trick, but it doesn't. It only checks for repetitions.


 

rcorbeil
Local time: 15:36
TOPIC STARTER
Any other ideas? Sep 26, 2012

Any other ideas?

Thanks.


 

George Cook
United Kingdom
Local time: 21:36
French to English
Not possible in Studio? Jul 2, 2013

I'd be very interested to hear about this as well. As rcorbeil says above, ticking "Report cross-file repetitions" does exactly what you would imagine: it reports identical sentences across all the files being analysed. "Report internal fuzzy match leverage" also does what you would imagine, but on a file-by-file basis (i.e. the internal fuzzy matches for each file independently).

There is, however, no apparent way to get Studio to report cross-file fuzzy matches (i.e. a sentence appearing in the second file that is 97% identical to one in the first file), in the same way that the "Use TM from previous analysis" function does in Workbench.

This means that, in theory, you could have two files where the second is changed only by one word in every sentence, and running an analysis in Studio would lead you to believe that the files were entirely different, as none of the sentences were an exact match.

If there is a workaround to this, I'd be very grateful to hear it. I'd also be most interested to hear the thoughts of anyone from SDL, not only as regards the apparent lack of this function in Studio 2009/2011, but also as to the likelihood of it being reinstated in the next release.
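
To make the distinction above concrete, here is a rough Python sketch of what a cross-file fuzzy check amounts to. It is not how Studio or Workbench calculate match values: it uses the character ratio from Python's standard difflib module as a stand-in, and the function name, the 75% threshold and the sample sentences are all invented for illustration.

```python
from difflib import SequenceMatcher


def cross_file_fuzzy(first_file, second_file, threshold=0.75):
    """Return (segment, best_match, score) for segments of second_file that
    resemble, but are not identical to, a segment of first_file.

    Hypothetical helper: difflib's character ratio stands in for a CAT
    tool's match value, which is computed differently.
    """
    results = []
    for segment in second_file:
        best_score, best_match = 0.0, None
        for candidate in first_file:
            score = SequenceMatcher(None, segment, candidate).ratio()
            if score > best_score:
                best_score, best_match = score, candidate
        # Exact matches (score 1.0) are plain repetitions, so skip them here.
        if best_match is not None and threshold <= best_score < 1.0:
            results.append((segment, best_match, round(best_score, 2)))
    return results


# Made-up sample segments standing in for two files.
file_a = ["Press the red button to start the machine.",
          "Keep the cover closed during operation."]
file_b = ["Press the green button to start the machine.",
          "Keep the cover closed during operation.",
          "Contact your dealer for spare parts."]

for segment, match, score in cross_file_fuzzy(file_a, file_b):
    print(f"{score:.0%}  {segment!r}  ~  {match!r}")
```

In this sample, the second sentence of file_b is an exact cross-file repetition, which Studio already reports. The first sentence differs from its counterpart in file_a by one word, and that is the kind of segment that goes uncounted when only repetitions are compared across files.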


 

David Terhart  Identity Verified
Germany
Local time: 22:36
English to German
Workaround Sep 17, 2013

George Cook wrote:

If there is a workaround to this, I'd be very grateful to hear it.


If we are dealing with several Word files, as in my current case, I think it should work to open all the files in Word, copy the contents of each into a single new file, and analyze that file.
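
If the content happens to be available as plain text (an assumption, since my own case involves Word files, which this does not read), the same merge could be scripted instead of copying by hand. A minimal sketch, with a made-up function name and file names:

```python
from pathlib import Path


def merge_text_files(paths, output_path, encoding="utf-8"):
    """Concatenate the given text files, in order, into output_path.

    Hypothetical helper for plain-text exports only; it does not read
    .doc or .docx files.
    """
    parts = [Path(p).read_text(encoding=encoding).strip() for p in paths]
    # Separate the files with a blank line so their contents stay apart.
    merged = "\n\n".join(part for part in parts if part)
    out = Path(output_path)
    out.write_text(merged + "\n", encoding=encoding)
    return out


# Example call (file names are placeholders):
# merge_text_files(["file1.txt", "file2.txt"], "merged.txt")
```

The merged file can then be analyzed as a single document, so that the internal fuzzy match option sees all of the segments together.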


 

