1. Open the HTML files in Word.
2. Save them as .txt files. (This strips all standard HTML tags.)
3. Use a CAT tool to analyze the .txt files (i.e., count the words against the contents of a TM).
4. Voilà, you get the word count of all changes (i.e., words that aren't yet contained in the TM).
Thanks for the suggestion. The only problem I see is that, in our case, we have approx. 4,000 HTML files, so it would be VERY time-consuming to save each file in .txt format and then analyze each of them.
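For that many files, the "open in Word and save as .txt" step could probably be scripted rather than done by hand. Below is a minimal sketch in Python (standard library only) that walks a folder, strips the tags from every .htm/.html file, and writes a matching .txt file. It is not what Word does internally, just a rough substitute. The names `html_to_text` and `convert_folder` and the folder names in the usage note are made up for illustration, and the sketch assumes the files are UTF-8. It also joins all text with single spaces, so paragraph breaks are lost, which may affect how a CAT tool segments the result.

```python
from html.parser import HTMLParser
from pathlib import Path


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_text(html: str) -> str:
    """Return the text content of an HTML string, pieces joined by spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def convert_folder(src_dir: str, dst_dir: str) -> int:
    """Convert every .htm/.html file under src_dir to .txt under dst_dir.

    Returns the number of files converted. Assumes UTF-8 input (an
    assumption; undecodable bytes are replaced rather than raising).
    """
    src, dst = Path(src_dir), Path(dst_dir)
    count = 0
    for html_file in sorted(src.rglob("*.htm*")):
        out = dst / html_file.relative_to(src).with_suffix(".txt")
        out.parent.mkdir(parents=True, exist_ok=True)
        raw = html_file.read_text(encoding="utf-8", errors="replace")
        out.write_text(html_to_text(raw), encoding="utf-8")
        count += 1
    return count
```

Calling something like `convert_folder("html_in", "txt_out")` (hypothetical folder names) would leave you with one folder of .txt files, which most CAT tools can analyze against the TM in a single batch instead of file by file.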