Can you download a whole website in one go?
Thread poster: Chris S

Chris S  Identity Verified
United Kingdom
Swedish to English
+ ...
Oct 26, 2011

I have been asked to check the English version of a bilingual website against the foreign version. The customer wants me to send my corrections as tracked changes in Word.

Is there an easy way to extract the text from all pages of the website into one or more Word files? Or do I have to go to every page individually, save it on my computer as HTML and then open it in Word?

Thanks for any ideas!


 

DZiW
Ukraine
English to Russian
+ ...
sure Oct 26, 2011

There are many similar tools, but my favourite is Teleport.

 

Michael Grant
Japan
Local time: 19:15
Japanese to English
A couple of things... Oct 26, 2011

You could try using the HTTrack Website Copier, here:

http://www.httrack.com/

Or, if you have the Windows command-line version of "wget", you can use that:
http://www.gnu.org/software/wget/

Both ways download the HTML files, which you would then have to open in Word...

There are other shareware and freeware applications out there as well; just Google for "website downloader"...
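Once either tool has finished, the pages sit in a folder on disk. A minimal Python sketch for listing every HTML file in that folder, so that none is missed when opening them in Word, might look like this. The function name and the folder layout are assumptions for illustration, not something HTTrack or wget provides.

```python
from pathlib import Path


def find_html_files(mirror_dir):
    """Return every .html/.htm file under mirror_dir, sorted by path."""
    root = Path(mirror_dir)
    return sorted(
        p for p in root.rglob("*")
        # Match the extension case-insensitively (.HTML, .Htm, ...).
        if p.is_file() and p.suffix.lower() in (".html", ".htm")
    )
```

Calling find_html_files with the path of the download folder returns the pages in a stable order, which can then be worked through one by one.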


 

Max Chernov
Russian Federation
Local time: 13:15
Russian to German
+ ...
There are many possibilities... Oct 26, 2011

That is, there are many programs that let you make a complete copy of a website...

Offline Explorer Pro, Teleport Pro, Webcopier...


 

David Wright  Identity Verified
Austria
Local time: 11:15
German to English
+ ...
Your client Oct 26, 2011

should have the files you need, rather than expecting you to download them yourself (as far as I know, though I'm no expert, you have to do it page by page).

 

Stanislaw Czech, MCIL  Identity Verified
United Kingdom
Local time: 10:15
Member (2006)
English to Polish
+ ...
You could use this program Oct 26, 2011

http://www.httrack.com/

It will download the entire website; afterwards you can open the required HTML files in MS Word.
Mind that it will download the entire website, not just the files you need.

Cheers
S


 

Chris S  Identity Verified
United Kingdom
Swedish to English
+ ...
TOPIC STARTER
Thanks Stanislaw, but... Oct 26, 2011

I tried HTTrack but gave up after an hour as it downloads everything, including big PDFs and images, and I only want the text!

 

Samuel Murray  Identity Verified
Netherlands
Local time: 11:15
Member (2006)
English to Afrikaans
+ ...
Specify Oct 26, 2011

Chris S wrote:
I tried HTTrack but gave up after an hour as it downloads everything, including big PDFs and images, and I only want the text!


Surely there is an option in the download task to specify what files should (or should not) be downloaded? You should also be able to set the crawl depth (how many subdirectories down) and whether files from other domains should be downloaded or not.
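To illustrate the kind of rules such options apply, here is a rough Python sketch of a filter that a crawler could consult before fetching each link. It is not how HTTrack itself works; the function name, the extension list and the default depth are all assumptions, and "depth" here is counted in link hops from the start page.

```python
from urllib.parse import urlparse

# Hypothetical list of file types to skip; adjust to taste.
SKIP_EXTENSIONS = (".pdf", ".jpg", ".jpeg", ".png", ".gif", ".zip", ".mp4")


def should_download(url, start_url, depth, max_depth=3):
    """Return True if the crawler should fetch this URL."""
    target = urlparse(url)
    start = urlparse(start_url)
    # Stay on the site we started from; ignore other domains.
    if target.netloc.lower() != start.netloc.lower():
        return False
    # Do not follow links more than max_depth hops from the start page.
    if depth > max_depth:
        return False
    # Skip large binary files such as PDFs and images.
    if target.path.lower().endswith(SKIP_EXTENSIONS):
        return False
    return True
```

With rules like these, a link to a PDF brochure or to a page on another domain is rejected, while an ordinary page on the same site within the depth limit is accepted.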


 

Chris S  Identity Verified
United Kingdom
Swedish to English
+ ...
TOPIC STARTER
Thanks Samuel Oct 26, 2011

You're right, and now I have the files!

But I still have to get them into Word, where every file is grey text on a black background and contains the whole menu and side bars and everything. If only people still used frames, eh?
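One possible way around both problems is to strip each page down to plain text before it goes into Word, which discards the colours along with the markup. The Python sketch below uses only the standard library. It assumes the menus and sidebars are marked up with nav, header, footer or aside tags; many sites use plain div elements instead, which this would not catch, so the tag list would need adapting to the actual site.

```python
from html.parser import HTMLParser

# Tags whose contents are treated as "not page text". This list is an
# assumption; sites that build menus from plain <div>s need other rules.
SKIP_TAGS = {"script", "style", "nav", "header", "footer", "aside"}


class TextOnly(HTMLParser):
    """Collect visible text, ignoring anything inside SKIP_TAGS."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # how many skipped tags we are currently inside
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the visible text of an HTML string, one chunk per line."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

The result is plain text, which can be saved to a .txt file and opened in Word in the default black-on-white formatting.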


 

