Extract repetitions from an excel document
Thread poster: Jorge22112002
Mar 17, 2008


I have an Excel document that will be divided into four. These four documents contain more new words than the original, because some repetitions count as new words in the divided documents.

I'd like to extract all the repetitions from the original document to get two files, one with the new segments and the other with the repetitions. Then I'll send the repetitions document for translation, and the new one will be divided.

Do you know how I can do it?

Thank you very much in advance.




Tjasa Kuerpick
Member (2006)
Slovenian to German
How to deal with duplicates in Excel Mar 17, 2008

Well, I don't understand why you want to split the document to find the repetitions, because if you work with Trados, it will store in its TM everything you have translated once and will not treat it as new again.
Unless you actually want to know how many repetitions there are, but in that case you should note that Trados has its own way of counting.

Before attempting any changes to the original file, always make a copy and leave the original untouched for safety.

There is a way to eliminate the repetitions in the Excel file simply by filtering them out and selecting the no-duplicates option. Depending on how large your file is, it might take Excel some time to finish this job. Excel then marks all the cells (which should preferably be in one column), and you may now select the whole table and copy it to the new destination (a new Excel table).
When copying, the key combination ALT+C followed shortly by ALT+X deletes from the original file all cells that have no duplicates.
When you return to the original file, you may now check the repetitions and store them in a second table.
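For anyone who prefers to do the split outside Excel, here is a minimal Python sketch of the same idea. It assumes the cell texts have already been read into a plain list (one string per cell); the function name `split_repetitions` and the sample data are made up for illustration and are not part of Excel or Trados.

```python
from collections import Counter

def split_repetitions(segments):
    """Split cell texts into those that occur once and those that repeat.

    Returns (unique, repeated): `unique` keeps every text that appears
    exactly once, `repeated` keeps one copy of every text that appears
    more than once. Both lists follow the order of first appearance.
    """
    counts = Counter(segments)
    unique = [s for s in segments if counts[s] == 1]
    repeated = [s for s, n in counts.items() if n > 1]
    return unique, repeated

# Hypothetical sample data standing in for one column of the sheet.
cells = ["Customer service", "Order number", "Customer service", "Invoice"]
unique, repeated = split_repetitions(cells)
print(unique)    # texts to divide among the translators
print(repeated)  # texts to translate once, up front
```

The comparison here is exact, so it shares the limitation described next: two texts that differ only by a space are not treated as repetitions.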
BUT you should note that Excel is sensitive to spaces. What does this mean?
Let's say you have two cells with the same words in them:
Cell C13: Customer service
Cell E24: Customer service
Obviously these two cells are duplicates, yet Excel does not recognize them as duplicates, because the second cell (E24) has a space after the word "service", which cell C13 does not. Thus, Excel will not touch these two cells and will leave the duplicates in the original table.
As these spaces are often important, you may not simply delete them: doing so might affect the final layout or some other feature, or two words might end up merged together, and so on. If you want Excel to recognize such cells as duplicates regardless, you have to do some VBA programming in Excel, where you can define how Excel should treat such cases.
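The post suggests VBA for this; as an alternative, the same idea can be sketched in Python: compare cells on a whitespace-normalized key while leaving the stored text untouched. This is only an illustration under assumptions. The helper names `normalize` and `group_duplicates` are hypothetical, and the cell contents are assumed to be available as a dictionary from cell reference to text.

```python
from collections import defaultdict

def normalize(text):
    """Comparison key only: trim the ends and collapse runs of whitespace."""
    return " ".join(text.split())

def group_duplicates(cells):
    """Group cell references whose texts match after normalization.

    `cells` maps a cell reference (e.g. "C13") to its text. The original
    texts are never modified; only the comparison key is normalized.
    """
    groups = defaultdict(list)
    for ref, text in cells.items():
        groups[normalize(text)].append(ref)
    # Keep only keys shared by more than one cell.
    return {key: refs for key, refs in groups.items() if len(refs) > 1}

# The example from the post: E24 has a trailing space, C13 does not.
cells = {"C13": "Customer service", "E24": "Customer service ", "A1": "Invoice"}
dupes = group_duplicates(cells)
print(dupes)
```

Because the spaces are removed only in the comparison key, the layout concern above does not arise: the text in E24 still ends with its space.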

All in all, this way of working is rather time-consuming, especially in a very big file, and is advisable only where careful preparation is needed: to increase the quality of the translation, when two or more people have to work on the document because of a short deadline, or to ensure that they all use the same terms or wording. Nevertheless, you should note as well that, depending on the content, some repetitions might have a completely different meaning, for example if part of the sentence was put in the next cell or if the words occur in combination with other words (which also depends on the subject of your translation). It is therefore important to check, while translating, whether all those repetitions are correct and make sense in their context.

Extract repetitions from an excel document Mar 18, 2008


Thank you for your answer. It is a good way to solve my problem.
The problem arises because I have to send the divided files to different translators, and I don't want them to translate the repeated segments in different ways.

Thank you very much for your help


