Find only repeated words
Thread poster: Samuel Murray

Samuel Murray  Identity Verified
Netherlands
Local time: 17:51
Member (2006)
English to Afrikaans
+ ...
May 22

Hello everyone

Using MS Word, is there a search string or a macro that will find and/or highlight all repeated words? I know that repeated words are flagged during spell-check but I want to look for ONLY repeated words. By "repeated words" I mean words that repeat next to each other, e.g. "the" in "paris in the the spring".

Alternatively, what would the search strings be in regular expressions in a regex-capable text editor?

Thanks
Samuel


 

Thomas T. Frost  Identity Verified
Member (2014)
Danish to English
+ ...
I would use Excel May 22

I would split the whole Word file into one word per line (using the Replace function: " " to "^p" to change blanks to line breaks), then copy and paste the whole thing into Excel, where you can use a formula such as =IF(A1=A2;"REPEATED";"") and copy it down. If your regional settings use a comma as the list separator, write =IF(A1=A2,"REPEATED","") instead.

You may need to remove some blank lines first. You could start by numbering all the Excel rows (fill a series down a spare column). Then you can sort on the text column, delete the blank rows, and sort again on the row numbers to get the text back in its original order.

Does it make sense?
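
The same one-word-per-row comparison can also be sketched outside Excel. Below is a minimal Python sketch, assuming the document text is available as a plain string; the function name adjacent_repeats is hypothetical. It splits on whitespace (so no blank rows arise) and compares each word with the one after it, much like the IF formula copied down the column. Note that the comparison here is exact and case-sensitive, and that punctuation stays attached to words, so "spring." and "spring" count as different.

```python
def adjacent_repeats(text):
    """Return (position, word) pairs where a word equals the word right after it."""
    words = text.split()  # split on any whitespace; no empty entries are produced
    return [
        (i, current)
        for i, (current, following) in enumerate(zip(words, words[1:]))
        if current == following  # exact, case-sensitive comparison
    ]


print(adjacent_repeats("paris in the the spring"))
```

The position is the zero-based index of the first of the two identical words.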


 

Philip Lees  Identity Verified
Greece
Local time: 18:51
Member (2008)
Greek to English
Regex for repeats May 23

Samuel Murray wrote:

Alternatively, what would the search strings be in regular expressions in a regex-capable text editor?



This works in Perl:

$_ = 'Paris in the the spring is a wonderful wonderful time';

# The \b anchors stop partial matches such as "the then" from counting as repeats.
print "$1\n" while /\b([a-zA-Z]{2,})\s+\1\b/g;

The output is:

the
wonderful

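This only catches repeats of groups of at least two ASCII letters, separated by at least one whitespace character. A repeated one-letter word such as "a a" is not caught, nor are words containing accented letters, digits or apostrophes, and the comparison is case-sensitive, so "The the" is not caught either.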

You may be able to tweak it if it's not quite right.

The $1 holds the text captured by the parentheses, i.e. the repeated word. In contexts other than Perl it may need to be written as \1, which is also how the regex itself refers back to the captured word.

The g flag at the end forces it to search the entire string for matches, rather than returning the first match over and over again.
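
For anyone without Perl to hand, here is a rough Python equivalent. It is a sketch, not a drop-in replacement: the function name find_repeats is hypothetical, and the pattern uses \b word boundaries so that "the then" is not reported. re.finditer scans the whole string, which plays the role of the g flag, and m.group(1) corresponds to $1.

```python
import re

# \b word boundaries keep partial matches like "the then" out.
REPEAT = re.compile(r"\b([a-zA-Z]{2,})\s+\1\b")


def find_repeats(text):
    """Return (offset, word) for every immediately repeated word in text."""
    return [(m.start(), m.group(1)) for m in REPEAT.finditer(text)]


sample = "Paris in the the spring is a wonderful wonderful time"
for offset, word in find_repeats(sample):
    print(offset, word)
```

The offset is the character position where the first copy of the word starts, which is handy for jumping to the spot in an editor.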


 

Philip Lees  Identity Verified
Greece
Local time: 18:51
Member (2008)
Greek to English
And in Word May 23

Samuel Murray wrote:

Using MS Word, is there a search string or a macro that will find and/or highlight all repeated words?


Well whaddyaknow? My regex also works in Word with a few tweaks.

Search for:

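<([a-zA-Z]{2,}) @\1>

with "Use wildcards" checked. The < and > anchor the match to the start and end of a word, so that "the then" is not reported as a repeat.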

I didn't know Word's search function could handle repeat matches like that.
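
For the "highlight" half of the original question, a plain-text analogue outside Word could wrap each repeat in visible markers. This is only a minimal Python sketch: the function name mark_repeats and the >> << markers are hypothetical choices, and it works on a text string rather than on a Word document.

```python
import re

# Group 1 is the word, group 2 the whitespace between the two copies.
REPEAT = re.compile(r"\b([a-zA-Z]{2,})(\s+)\1\b")


def mark_repeats(text, start=">>", end="<<"):
    """Wrap the second copy of each repeated word in visible markers."""
    def _mark(match):
        word, gap = match.group(1), match.group(2)
        return f"{word}{gap}{start}{word}{end}"

    return REPEAT.sub(_mark, text)


print(mark_repeats("paris in the the spring"))
```

Searching the result for the start marker then finds only the repeated words.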


 

Samuel Murray  Identity Verified
Netherlands
Local time: 17:51
Member (2006)
English to Afrikaans
+ ...
TOPIC STARTER
Yes, pleasant surprise May 23

Philip Lees wrote:
Well whaddyaknow? My regex also works in Word with a few tweaks.
I didn't know Word's search function could handle repeat matches like that.


(-:


 

Philip Lees  Identity Verified
Greece
Local time: 18:51
Member (2008)
Greek to English
And now ... (off topic) May 23

Samuel Murray wrote:

(-:


If you think that's cute, have a look at this:

http://neilk.net/blog/2000/06/01/abigails-regex-to-test-for-prime-numbers/


 

