Problems with searching in a very large Word document
Thread poster: BrianHayden

BrianHayden
United States
Russian to English
Feb 2, 2015

One of the main dictionaries that I use is a Word document. I add to it regularly, and it has grown to a size of 530,500 words on 50,500 lines, which comes out to 1349 pages. Lately it's been reacting slowly when I punch Ctrl+F -- sometimes it just freezes for thirty seconds or more when I try to search it. I assume that this is due to the massive size of the file, but I could be wrong.

Would any of the following steps make the document "react" faster?
-- I have all of the entries color-coded (blue means that I'm sure of my definition, peach means that I still need to search for a better equivalent, etc.). Does this add "information" in the document? Could I make things easier on Word if I made most of the entries plain black?
-- Would cutting some fat, i.e. reducing the raw word count, in the definitions make an improvement?
-- Keeping only two documents open at a time while using the dictionary document?

[Edited at 2015-02-02 11:15 GMT]


 

Michael Beijer  Identity Verified
United Kingdom
Local time: 03:37
Member (2009)
Dutch to English
Hi Brian, Feb 2, 2015

What format is it in? .doc? .docx? .rtf? Sometimes re-saving a .doc or .rtf file as .docx can greatly reduce its size and thus speed it up.

Another idea might be to move to a new format. A text file that size would open easily in e.g. EmEditor (a text editor). You'd have to convert your colour coding system to something new though. Perhaps mapping colours to asterisks (*, **, ***) or some other character…
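To illustrate the plain-text idea, here is a minimal Python sketch. It assumes one entry per line in a UTF-8 text file and a made-up marker scheme in which leading asterisks replace the colours (say `*` for entries that were blue, `**` for peach). The function names and the file layout are assumptions for illustration, not something any particular tool requires.

```python
import re


def search_glossary(lines, term, marker=None):
    """Return (line_number, text) for every line containing `term`.

    The match is case-insensitive. If `marker` is given (for example "**"),
    only lines that start with exactly that run of asterisks are returned.
    """
    needle = term.casefold()
    hits = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        # Leading asterisks are the (hypothetical) replacement for colours.
        match = re.match(r"(\*+)\s*", text)
        line_marker = match.group(1) if match else ""
        if marker is not None and line_marker != marker:
            continue
        if needle in text.casefold():
            hits.append((number, text))
    return hits


def search_file(path, term, marker=None):
    """Run search_glossary over a UTF-8 text file, reading it line by line."""
    with open(path, encoding="utf-8") as handle:
        return search_glossary(handle, term, marker)
```

Calling `search_file("glossary.txt", "дом", marker="**")` (hypothetical file name) would list only the entries still flagged for review. A linear scan like this over roughly 50,000 lines should take a small fraction of a second.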


 

Dan Lucas  Identity Verified
United Kingdom
Local time: 03:37
Member (2014)
Japanese to English
You're using the wrong tool Feb 2, 2015

BrianHayden wrote:
One of the main dictionaries that I use is a Word document. I add to it regularly, and it has grown to a size of 530,500 words on 50,500 lines, which comes out to 1349 pages.

Remember that it's a word processor, not a search engine! That's a huge file. Why not look into using something like TMLookup, which is designed for this sort of thing and is free? Or there are many other tools out there. How much work time are you losing waiting for Word every day?

Dan


 

Michael Beijer  Identity Verified
United Kingdom
Local time: 03:37
Member (2009)
Dutch to English
Xbench is another interesting option Feb 2, 2015

You could also store your stuff in text files, or any number of other formats, and search it all using Xbench: http://www.xbench.net/

[Image: Xbench-file-types.png]

[Edited at 2015-02-03 12:37 GMT]


 

Meta Arkadia
Local time: 10:37
English to Indonesian
Possible solutions Feb 3, 2015

You're using the wrong tool, as Dan says. The trouble is, of course, that when you start building a glossary, you probably have no idea how things can get out of hand, so a Word file doesn't seem such a bad idea. Until now. What you can do:

You could try to split your huge Word file, and - if you still use Windows - buy dtSearch, to search the split files. It's not exactly free software, but I think dtSearch will show your colour codes in the search results.

You could also save your Word file as HTML. There shouldn't be a problem with the file size. If you then download and install DocFetcher - free, cross-platform - you should be able to perform fast searches, since DocFetcher indexes the files. As you can see below, I tried it, but with a very small original Word file converted to HTML only. Success not guaranteed.

[Screenshot: Brian DocFetcher.png]
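Why indexing makes searches fast can be shown with a toy inverted index in Python. This is only a sketch of the general technique, not DocFetcher's actual implementation; the function names are made up for illustration. The index is built once, and each lookup afterwards is a dictionary access instead of a scan through the whole text.

```python
import re
from collections import defaultdict


def build_index(lines):
    """Map each case-folded word to the set of line numbers containing it."""
    index = defaultdict(set)
    for number, line in enumerate(lines, start=1):
        for word in re.findall(r"\w+", line.casefold()):
            index[word].add(number)
    return index


def lookup(index, word):
    """Return the sorted line numbers where `word` occurs as a whole word."""
    return sorted(index.get(word.casefold(), ()))
```

Note that this toy version only finds whole words; real indexing tools also handle prefixes, phrases and so on.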

This may only solve the search problem for the moment, I'm afraid. You really should find another solution. I think it can be found in the XML part of the Word file, where you should turn your colour codes into something that can be used in plain text files/database files, but easy it is not.
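A rough Python sketch of that XML route, using only the standard library. It assumes the file is saved as .docx (a zip archive containing `word/document.xml`) and that the colours are applied as direct font colour on the text runs; colours coming from styles, themes or highlighting would need extra handling. The hex values and the asterisk markers in `COLOR_TO_MARKER` are placeholders, so check which values your own document actually uses.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace in the form ElementTree expects.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

# Hypothetical mapping: font colour (hex) -> plain-text marker.
COLOR_TO_MARKER = {"0000FF": "*", "FFCC99": "**"}


def run_text(run):
    """Collect the visible text of one run, turning tab elements into tabs."""
    parts = []
    for child in run:
        if child.tag == W + "t":
            parts.append(child.text or "")
        elif child.tag == W + "tab":
            parts.append("\t")
    return "".join(parts)


def paragraphs_with_markers(document_xml, color_to_marker=COLOR_TO_MARKER):
    """Return one text line per non-empty paragraph, prefixed with a marker.

    The marker is taken from the first coloured run in the paragraph whose
    colour appears in `color_to_marker`; uncoloured paragraphs get no prefix.
    """
    root = ET.fromstring(document_xml)
    lines = []
    for para in root.iter(W + "p"):
        marker = ""
        parts = []
        for run in para.iter(W + "r"):
            text = run_text(run)
            if not text:
                continue
            parts.append(text)
            color = run.find(W + "rPr/" + W + "color")
            if color is not None and not marker:
                value = color.get(W + "val", "").upper()
                marker = color_to_marker.get(value, "")
        text = "".join(parts)
        if text.strip():
            lines.append(f"{marker} {text}" if marker else text)
    return lines


def docx_to_lines(path):
    """Read word/document.xml from a .docx file and convert it to lines."""
    with zipfile.ZipFile(path) as archive:
        return paragraphs_with_markers(archive.read("word/document.xml"))
```

The resulting lines can be written to a UTF-8 text file and searched with any text editor or search tool. Try it on a copy of the document first.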

Cheers,

Hans


[Edited at 2015-02-03 03:00 GMT]


 

Rolf Keller
Germany
Local time: 04:37
English to German
Use WordViewer instead of Word Feb 3, 2015

BrianHayden wrote:

One of the main dictionaries that I use is a Word document.


On principle, for read-only text documents one should use the free WordViewer from Microsoft. As it is just a pure display tool, it is leaner, faster and doesn't touch/influence any docs you are viewing/editing at the same time.

And/or try .rtf format.

BTW 1, the first thing I do after installing Windows & Office: I change the targets of the Office file extensions (.doc, .docx and so on), so that clicking such a file opens the respective Viewer instead of the editing software.

BTW 2, Microsoft provides such viewers for Excel & PowerPoint as well.

BTW 3, how large is your dictionary file, compared to the RAM in your PC?

I add to it regularly


Any addition is stored as a "change", so make sure that the change history function is switched off. After adding new entries, save and re-open the doc so that all changes are frozen.

Could I make things easier on Word if I made most of the entries plain black?


A try would take 1 minute ...


 

