Removing duplicates from a termbase
Thread poster: Jacques DP

Jacques DP  Identity Verified
Switzerland
Local time: 00:29
Member (2003)
English to French
Nov 6, 2006

Hello,

I have a termbase in MultiTerm 7, and there are many duplicate entries, by which I mean identical pairs (same source term, same translation).

Is there a way to remove such duplicate entries (keeping just one instance)?

By the way, this termbase is an import of the new MS terminology at .

Thanks,

Jacques



Jacques DP  Identity Verified
Switzerland
Local time: 00:29
Member (2003)
English to French
TOPIC STARTER
Here is the URL that didn't get through in the previous post Nov 6, 2006

http://www.microsoft.com/globaldev/tools/MILSGlossary.mspx

Ulrich Roos
Local time: 00:29
German
You may search for duplicate terms Nov 10, 2006

Hi Jacques,

Unfortunately, you cannot search for duplicate entries.

However, it might be useful to search for duplicate terms by opening the menu "Search" and clicking on "Search for duplicate terms". This command searches through your current source language and lists all terms that occur more than once. You will have to browse through that list but at least you don't have to work your way through your entire database.

I hope this helps.

Best,

Ulrich



Jacques DP  Identity Verified
Switzerland
Local time: 00:29
Member (2003)
English to French
TOPIC STARTER
How I solved it Nov 10, 2006

Dear Ulrich,

Thanks for your answer. I saw this, but there were too many duplicate entries, and deleting them manually, even having the list of duplicate terms, was not feasible.

Since I imported the termbase from an Excel file, I reasoned that the problem would be easier to solve within Excel. (I am surprised, though, that the MultiTerm importing process doesn't offer the option of removing duplicate entries, since they are generally useless.)

Having verified that it couldn't be done through the menus in Excel, and not feeling like coding the Visual Basic script myself, I googled for it and found it here: http://www.softplatz.com/Soft/Business/Office-Suites-Tools/Excel-Unique-Duplicate-Data-Remover.html

It's shareware, but the free version will do the trick (choose Duplicate > Duplicate Row Wizard).

The only price is the risk of installing something of unknown origin: it may contain a virus, spyware, or whatever (in fact, though it's just a macro, it comes as an executable to install the macro...). Use at your own risk.

(Since the search query I used in Google was not overly specific, this means that the site where I found the macro had a good Google PageRank, which in turn makes it likely that no viruses are posted there. But this is just a quick guess, not a guarantee.)
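For readers who would rather not install a third-party macro, the same clean-up can be sketched in a few lines of Python. This is only an illustration, not what was actually used in this thread: it assumes the Excel sheet has been saved as a two-column CSV (source term, translation), and the sample data and the function name dedupe_rows are made up for the example.

```python
import csv
import io


def dedupe_rows(rows):
    """Drop repeated rows, keeping the first occurrence of each.

    Rows that differ only in leading/trailing whitespace are
    treated as duplicates (an assumption; adjust if that matters).
    """
    seen = set()
    unique = []
    for row in rows:
        key = tuple(cell.strip() for cell in row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical sample standing in for a CSV export of the Excel sheet.
sample = "file,fichier\nfolder,dossier\nfile,fichier\nfile,dossier de fichiers\n"
rows = list(csv.reader(io.StringIO(sample)))

for row in dedupe_rows(rows):
    print(row)
```

Note that only identical pairs are dropped: the same source term with a different translation is kept, which matches the definition of "duplicate" given in the first post.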

Best,

Jacques



Vito Smolej
Germany
Local time: 00:29
Member (2004)
English to Slovenian
What I would do... Nov 10, 2006

is create a pivot table in Excel to remove duplicates. I know it borders on obscene, but then again...
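The pivot-table idea amounts to grouping identical rows and counting them; the distinct groups are then the de-duplicated list. A rough Python equivalent, as a sketch only (the sample pairs and the name count_pairs are invented for illustration, and this is not an Excel feature being reproduced exactly):

```python
from collections import Counter


def count_pairs(pairs):
    """Group identical (source, target) pairs and count each group."""
    return Counter(pairs)


# Hypothetical term pairs; the first one appears twice.
pairs = [
    ("file", "fichier"),
    ("folder", "dossier"),
    ("file", "fichier"),
]

counts = count_pairs(pairs)

# The keys are the distinct pairs, in order of first appearance
# (dict ordering is guaranteed from Python 3.7 on).
unique_pairs = list(counts)

for pair, n in counts.items():
    print(pair, n)
```

The counts are a side benefit: they show which pairs were duplicated and how often, much like the count column of a pivot table.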

btw, how did you manage to create so many duplicates? Using the same XML file to import (that's my source of doubles)?

Regards

smo



Jacques DP  Identity Verified
Switzerland
Local time: 00:29
Member (2003)
English to French
TOPIC STARTER
Answering your question Nov 11, 2006

Hi Vito,

How did I get the duplicates in the first place? As reported in my messages above, I downloaded the new MS glossary (see URL above) and kept only English and French (it's a multilingual glossary). If you do that, you will find an enormous number of duplicates: common words can have up to 10 occurrences, all with the same translation. This may be because the same term has sometimes been translated differently in other languages, so that these rows are not really duplicates in the complete glossary.
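To illustrate that guess (it is only a guess, and the rows below are invented, not taken from the MS glossary): two rows that differ in a third language become identical as soon as only the English and French columns are kept.

```python
def project(rows, columns):
    """Keep only the given columns of each row, as tuples."""
    return [tuple(row[c] for c in columns) for row in rows]


# Hypothetical multilingual rows: same English and French,
# but a different German translation in each.
glossary = [
    {"en": "file", "fr": "fichier", "de": "Datei"},
    {"en": "file", "fr": "fichier", "de": "Akte"},
]

full = project(glossary, ["en", "fr", "de"])
en_fr = project(glossary, ["en", "fr"])

print(len(set(full)))   # distinct rows in the complete glossary
print(len(set(en_fr)))  # distinct rows once only en/fr are kept
```

If this is indeed the cause, removing duplicates after dropping the other languages loses nothing for an English-French termbase.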

Anyway, it's solved now.

Thanks

Jacques



