Problems with searching in a very large Word document
Thread poster: BrianHayden

BrianHayden
United States
Russian to English
Feb 2, 2015

One of the main dictionaries that I use is a Word document. I add to it regularly, and it has grown to a size of 530,500 words on 50,500 lines, which comes out to 1349 pages. Lately it's been reacting slowly when I press Ctrl+F; sometimes it just freezes for thirty seconds or more when I try to search it. I assume that this is due to the massive size of the file, but I could be wrong.

Would any of the following steps make the document "react" faster?
-- I have all of the entries color-coded (blue means that I'm sure of my definition, peach means that I still need to search for a better equivalent, etc.). Does this add "information" to the document? Could I make things easier on Word if I made most of the entries plain black?
-- Would cutting some fat, i.e. reducing the raw word count, in the definitions make an improvement?
-- Keeping only two documents open at a time while using the dictionary document?

[Edited at 2015-02-02 11:15 GMT]


 

Michael Beijer  Identity Verified
United Kingdom
Local time: 09:23
Member (2009)
Dutch to English
Hi Brian, Feb 2, 2015

What format is it in? .doc? .docx? .rtf? Sometimes re-saving a .doc or .rtf file as .docx can greatly reduce its size and thus speed it up.

Another idea might be to move to a new format. A text file that size would open easily in e.g. EmEditor (a text editor). You'd have to convert your colour coding system to something new though. Perhaps mapping colours to asterisks (*, **, ***) or some other character…
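To make the colour-to-asterisk idea concrete, here is a minimal Python sketch. It assumes the entries have already been exported as (term, definition, colour) triples. Only the colour names "blue" and "peach" come from Brian's post; the marker choice, the tab-separated layout, the function names and the two sample entries are made up for illustration.

```python
# Hypothetical mapping from colour names to plain-text markers.
# "blue" and "peach" are from the original post; the markers are an assumption.
CONFIDENCE_MARKERS = {
    "blue": "*",    # sure of the definition
    "peach": "**",  # still looking for a better equivalent
}


def to_plain_line(term, definition, colour):
    """Render one entry as a tab-separated line, with a trailing marker if any."""
    marker = CONFIDENCE_MARKERS.get(colour, "")
    line = f"{term}\t{definition}"
    return f"{line}\t{marker}" if marker else line


def search(lines, query):
    """Case-insensitive substring search over plain-text glossary lines."""
    needle = query.casefold()
    return [line for line in lines if needle in line.casefold()]


# Made-up sample entries, not taken from the real dictionary.
entries = [
    ("дом", "house", "blue"),
    ("очаг", "hearth", "peach"),
]
lines = [to_plain_line(*entry) for entry in entries]
print(search(lines, "hearth"))
```

A file built this way stays searchable in any text editor, and searching for the marker itself (e.g. a tab followed by **) would list the entries that still need work.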


 

Dan Lucas  Identity Verified
United Kingdom
Local time: 09:23
Member (2014)
Japanese to English
You're using the wrong tool Feb 2, 2015

BrianHayden wrote:
One of the main dictionaries that I use is a Word document. I add to it regularly, and it has grown to a size of 530,500 words on 50,500 lines, which comes out to 1349 pages.

Remember that it's a word processor, not a search engine! That's a huge file. Why not look into using something like TMLookup, which is designed for this sort of thing and is free? There are many other tools out there, too. How much work time are you losing waiting for Word every day?

Dan


 

Michael Beijer  Identity Verified
United Kingdom
Local time: 09:23
Member (2009)
Dutch to English
Xbench is another interesting option Feb 2, 2015

You could also store your stuff in text files, or any number of other formats, and search it all using Xbench: http://www.xbench.net/

[Image: Xbench-file-types.png]

[Edited at 2015-02-03 12:37 GMT]


 

Meta Arkadia
Local time: 15:23
English to Indonesian
Possible solutions Feb 3, 2015

You're using the wrong tool, as Dan says. The trouble is, of course, that when you start building a glossary, you probably have no idea how things can get out of hand, so a Word file doesn't seem such a bad idea. Until now. What you can do:

You could try to split your huge Word file, and - if you still use Windows - buy dtSearch, to search the split files. It's not exactly free software, but I think dtSearch will show your colour codes in the search results.

You could also convert your Word file to HTML (via Save As). There shouldn't be a problem with the file size. If you then download and install DocFetcher (free, cross-platform), you should be able to perform fast searches, since DocFetcher indexes the files. As you can see below, I tried it, but only with a very small original Word file converted to HTML. Success not guaranteed.

[Screenshot: Brian DocFetcher.png]
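For anyone wondering why an indexed search is fast, the general idea can be sketched in a few lines of Python. This is a toy inverted index, not a description of how DocFetcher works internally; the function names and the three sample lines are invented for illustration.

```python
from collections import defaultdict
import re


def build_index(lines):
    """Map each case-folded word to the set of line numbers that contain it."""
    index = defaultdict(set)
    for number, line in enumerate(lines):
        for word in re.findall(r"\w+", line.casefold()):
            index[word].add(number)
    return index


def lookup(index, lines, word):
    """Return the lines containing `word` without rescanning the whole text."""
    return [lines[n] for n in sorted(index.get(word.casefold(), ()))]


# Made-up sample glossary lines.
glossary = [
    "hearth: fireplace floor",
    "house: dwelling",
    "household: people living in one house",
]
index = build_index(glossary)
print(lookup(index, glossary, "house"))
```

The index is built once, up front. After that, each search is a dictionary lookup rather than a scan through all 1349 pages, which is the trade-off any indexing tool makes.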

This may only solve the search problem for the moment, I'm afraid. You really should find another solution. I think it can be found in the XML part of the Word file, where you should turn your colour codes into something that can be used in plain text files/database files, but easy it is not.
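As a rough starting point for the XML route, here is a Python sketch that reads the main XML part of a .docx file and reports each text run together with its font colour. It assumes the file is saved as .docx and that the colours were applied directly as font colours; colours that come from styles or from highlighting would not be picked up. The function name and the file name in the usage note are hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside word/document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def runs_with_colour(docx_file):
    """Yield (paragraph_number, text, colour) for each text run in a .docx.

    colour is the hex value from <w:color w:val="...">, or None when the run
    has no directly applied font colour.
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    for number, paragraph in enumerate(root.iter(f"{{{W}}}p")):
        for run in paragraph.iter(f"{{{W}}}r"):
            text = "".join(t.text or "" for t in run.findall("w:t", NS))
            colour_el = run.find("w:rPr/w:color", NS)
            colour = colour_el.get(f"{{{W}}}val") if colour_el is not None else None
            if text:
                yield number, text, colour
```

Calling it as `for number, text, colour in runs_with_colour("dictionary.docx"): ...` would give you the raw material for a plain-text or database export, with the hex colour values replaced by whatever markers you choose. Work on a copy of the file, not the original.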

Cheers,

Hans


[Edited at 2015-02-03 03:00 GMT]


 

Rolf Keller
Germany
Local time: 10:23
English to German
Use WordViewer instead of Word Feb 3, 2015

BrianHayden wrote:

One of the main dictionaries that I use is a Word document.


On principle, for read-only text documents one should use the free WordViewer from Microsoft. As it is just a pure display tool, it is leaner, faster and doesn't touch/influence any docs you are viewing/editing at the same time.

And/or try .rtf format.

BTW 1, the first thing I do after installing Windows & Office: I change the targets of the Office file extensions (.doc, .docx and so on), so that clicking such a file opens the respective Viewer instead of the editing software.

BTW 2, Microsoft provides such viewers for Excel & PowerPoint as well.

BTW 3, how large is your dictionary file, compared to the RAM in your PC?

BrianHayden wrote:

I add to it regularly


Any addition is stored as a "change", so make sure that the change history function (Track Changes) is switched off. After adding new entries, save and re-open the doc so that all changes get frozen.

BrianHayden wrote:

Could I make things easier on Word if I made most of the entries plain black?


A try would take 1 minute ...


 

