WORD: Alliteration VBA script
Thread poster: DZiW

DZiW
Ukraine
English to Russian
+ ...
Apr 24

Dear colleagues,

I got a task to check text for two-letter alliterations (marking them light-green) and here is a hurry-skurry yet working example:

Code:
Sub Alliterations()
N = ThisDocument.Words.Count: If N < 10 Then Exit Sub' if very short, then over
On Error Resume Next 'safety
Application.ScreenUpdating = False 'speeding up a little
i = 1

With ThisDocument
Do 'MAIN LOOP

Do 'seek the FIRST 2-letter pair at the beginning of the word
was = UCase(Left(.Words(i), 2))
i = i + 1
Loop Until (i < N) And (Left(was, 1) Like "[A-ZА-Я]")

it = UCase(Left(.Words(i), 2)) 'get the NEXT 2-letters from the NEXT word

While (it = was) And (i < N) 'and mark all matches
.Words(i).Characters(1).Font.ColorIndex = wdBrightGreen
.Words(i).Characters(2).Font.ColorIndex = wdBrightGreen
'.UndoClear
i = i + 1
DoEvents 'check for Ctrl+ScrollLock
was = UCase(Left(.Words(i), 2))
Wend

'If Application.NumLock = True Then Exit Sub 'a terminator flag
If i Mod 100 = 0 Then Application.StatusBar = i & "/" & N - i: DoEvents
Loop Until i > = N
End With 'ThisDocument

Beep
MsgBox "Done"
End Sub



The hardware is no issue--a Corei7 desktop with 8GB RAM running MS Word 2013 under W8.1 x64, with a peak load under 30% and no intense HDD activity.

The document is a 25K-word DOCX (OOXML).

I ran the script, which did some 5000 words promptly, then started to slow down on and off without apparent reasons, steadily taking about 20% CPU time. As far as neither old DOC, nor RTF type changed a thing, nor Affinity/Priority, I thought it was something to do with Undo (see remarked UndoClear), but seems a red-herring too. I tried other plain texts with the same result.

Of course, the script is far from perfection and non-optimized, but I'm really missing something important... Any ideas?
Even better, if you know a software to mark alliterations at the beginning)

Thank you


 

John Fossey  Identity Verified
Canada
Local time: 12:00
Member (2008)
French to English
DoEvents? Apr 25

Possibly the way DoEvents is being used?

See for example: http://www.fmsinc.com/microsoftaccess/modules/examples/avoiddoevents.asp


 

Hans Lenting  Identity Verified
Netherlands
Member (2006)
German to Dutch
Regex? Apr 25

Not a software but a regex? ((aa)[a-z]*\s(aa)[a-z]*)|((bb)[a-z]*\s(bb)[a-z]*) etc.

Or: \b(\w)\1\w*\s(\w)\1\w*

https://regex101.com/r/fDuRWj/1

[Edited at 2018-04-25 04:56 GMT]


 

Rolf Keller
Germany
Local time: 18:00
English to German
String space is being filled up? Apr 25

DZiW wrote:

I'm really missing something important... Any ideas?

First of all you should delete the "On Error Resume Next" because this statement might hide some important error messages.

Maybe you use the string heap too extensively, so that VBA's garbage collector gets much work.
https://en.wikipedia.org/wiki/Garbage_collection_(computer_science)
https://stackoverflow.com/questions/19038350/when-should-an-excel-vba-variable-be-killed-or-set-to-nothing
Anyway, your loops put tens of thousands strings into the size-limited string heap. As soon as the heap gets full, you'll experience a slowdown.

VBA doesn't provide programmatic access to the garbage collector but there are some crutches:
https://blogs.msdn.microsoft.com/ericlippert/2004/04/28/when-are-you-required-to-set-objects-to-nothing/

In case VBA isn't able to do the job, you could use VSTO. Or you could rename .docx to .zip, unzip it, look for the .xml that contains the document's content and then process the .xml file programmatically. Both methods will require some time to get them working, though. And VSTO is not free, AFAIK.


 

DZiW
Ukraine
English to Russian
+ ...
TOPIC STARTER
Hello Apr 25

John, I know it works faster without DoEvents, but retains the same pattern--doing first ~5K words at ease and hobbling down, let alone it doesn't break/exit the subroutine using Ctrl+ScrLk, alas.

Hans, it's interesting, yet seems MS Word uses some other RegEx, right?


 


To report site rules violations or get help, contact a site moderator:


You can also contact site staff by submitting a support request »

WORD: Alliteration VBA script

Advanced search






SDL Trados Studio 2017 only €435 / $519
Get the cheapest prices for SDL Trados Studio 2017 on ProZ.com

Join this translator’s group buy brought to you by ProZ.com and buy SDL Trados Studio 2017 Freelance for only €435 / $519 / £345 / ¥63000 You will also receive FREE access to Studio 2019 when released.

More info »
Protemos translation business management system
Create your account in minutes, and start working! 3-month trial for agencies, and free for freelancers!

The system lets you keep client/vendor database, with contacts and rates, manage projects and assign jobs to vendors, issue invoices, track payments, store and manage project files, generate business reports on turnover profit per client/manager etc.

More info »



Forums
  • All of ProZ.com
  • Term search
  • Jobs
  • Forums
  • Multiple search