Suggest a tool for downloading entire websites (besides WinHTTrack and wget)
Thread poster: Artem Vakhitov

Artem Vakhitov
Estonia
English to Russian
Dec 19, 2013

Dear colleagues, I need to translate a huge website, and need a tool to download the entire site in a form most convenient for translation. (It's not guaranteed that I'll get the sources from the CMS, whatever it is. I know it sucks, but...)

The EasyLing service is too expensive for my purposes. WinHTTrack is user-unfriendly and too slow because of its artificial limitations; I've read that you can improve the speed somewhat, but you have to jump through hoops for that. Wget (in the form of VisualWget) is fast, but saves JSP files instead of HTML (probably I just don't know how to set it up properly for this particular purpose).

Can you suggest some efficient, user-friendly, and preferably free (or inexpensive) software or service that I can use for translating large websites? Alternatively, can you show me how to configure (Visual)Wget?

Thank you in advance!
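
On the Wget part of the question, here is a minimal sketch in Python that only assembles a Wget command line for an offline mirror. The helper name `build_wget_mirror_command`, the example URL and the output folder are made up for illustration; the Wget options themselves are standard ones. The option relevant to the JSP problem is `--adjust-extension`, which appends `.html` to the local file name of pages served as HTML whose URL does not already end that way (older Wget builds call it `--html-extension`). Whether this is enough for a particular site has not been tested here.

```python
import shlex


def build_wget_mirror_command(url, output_dir):
    """Return the argument list for a Wget mirror run (hypothetical helper)."""
    return [
        "wget",
        "--mirror",             # recursive download with timestamping
        "--page-requisites",    # also fetch images, CSS and scripts
        "--adjust-extension",   # save HTML pages with an .html suffix
        "--convert-links",      # rewrite links so they work offline
        "--no-parent",          # stay below the starting directory
        "--directory-prefix", output_dir,
        url,
    ]


if __name__ == "__main__":
    # Placeholder URL and folder; the command is printed, not executed.
    cmd = build_wget_mirror_command("http://www.example.com/", "site_copy")
    print(shlex.join(cmd))
```

The printed line can be pasted into a console, or the same options can be entered in VisualWget's settings if it exposes them.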



Michael Joseph Wdowiak Beijer
United Kingdom
Local time: 10:16
Member (2009)
Dutch to English
WinHTTrack is fast for me Dec 19, 2013

Hi Artem,

I have never noticed WinHTTrack being slow. On the contrary, I have actually downloaded very large sites, with thousands of pages, quite quickly.

Michael



Artem Vakhitov
Estonia
English to Russian
TOPIC STARTER
It is unfortunately slow for me Dec 19, 2013

Michael Beijer wrote:

Hi Artem,

I have never noticed WinHTTrack being slow. On the contrary, I have actually downloaded very large sites, with thousands of pages, quite quickly.

Michael


It is, however, slow for me. About 25 KiB/s is not good.



Samuel Murray
Netherlands
Local time: 11:16
Member (2006)
English to Afrikaans
@Artem -- it's a safety setting, so disable it Dec 19, 2013

Artem Vakhitov wrote:
It is, however, slow for me. About 25 KiB/s is not good.


When you set up the download project, click the "Set options" button, and then click the Limits tab. Then set the dropdown box at "Max transfer rate" to nothing (or to something higher).



Artem Vakhitov
Estonia
English to Russian
TOPIC STARTER
Helped somewhat, but still slowish Dec 19, 2013

Samuel Murray wrote:

Artem Vakhitov wrote:
It is, however, slow for me. About 25 KiB/s is not good.


When you set up the download project, click the "Set options" button, and then click the Limits tab. Then set the dropdown box at "Max transfer rate" to nothing (or to something higher).


Thank you. After I set the Limit to nothing, I got this:

Transfer rate: 0..100KiB/s (18KiB/s)

The first value fluctuated (from 0 to about 100), so I give an average. The second increased slowly.

After several minutes the first value started fluctuating around 3KiB/s, so it looks like it all got even slower after a short speed-up.



Michael Joseph Wdowiak Beijer
United Kingdom
Local time: 10:16
Member (2009)
Dutch to English
testing Dec 19, 2013

I just set the 'max transfer rate' to 50000 B/s, and am getting sustained speeds of 40.00–50.00 KiB/s, even after 10 minutes of downloading.

Michael
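
A quick unit check on those figures: the limit is entered in bytes per second, while the progress display reports KiB/s. Below is a small sketch of the conversion, assuming 1 KiB = 1024 bytes (the function name is illustrative only).

```python
def bytes_per_s_to_kib_per_s(rate_bytes_per_s):
    """Convert a rate in bytes/s to KiB/s, taking 1 KiB = 1024 bytes."""
    return rate_bytes_per_s / 1024


if __name__ == "__main__":
    cap = bytes_per_s_to_kib_per_s(50000)
    print(f"{cap:.1f} KiB/s")  # 50000 / 1024 = 48.828125
```

Under that assumption, a sustained 40 to 50 KiB/s means the download was running at or near the configured cap of roughly 48.8 KiB/s.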



Artem Vakhitov
Estonia
English to Russian
TOPIC STARTER
Could not achieve the same Dec 19, 2013

Michael Beijer wrote:

I just set the 'max transfer rate' to 50000 B/s, and am getting sustained speeds of 40.00–50.00 KiB/s, even after 10 minutes of downloading.

Michael


I could not get it to sustain that speed. Anyhow, I've finally managed to download the site.



Alex Lago
Spain
Local time: 11:16
Member (2009)
English to Spanish
Does the website download quickly on your browser? Dec 19, 2013

Have you tried accessing the website directly? Does the website download quickly on your browser when you click on different pages?

I've never had speed problems with WinHTTrack, and I've used it to download some massive websites (over 1000 pages). Could the problem be on the server side?



Michael Wetzel
Germany
Local time: 11:16
German to English
other options Dec 20, 2013

I was looking for something similar and came across CatsCradle and Caterpillar from Stormdance (http://www.stormdance.net/). Both are inexpensive and seemed to be just what I was looking for based on the provider's descriptions. They've also been discussed here several times in the forums, but not for a while. I'd be interested if anyone has anything new to say about them.

I also came across WebBudget XT, which was more expensive and seemed to be basically a CAT tool designed specifically for website translation. (This has also been discussed here before, mostly in the same threads as CatsCradle.)



Artem Vakhitov
Estonia
English to Russian
TOPIC STARTER
I've found how to unlock the speed in WinHTTrack GUI Dec 21, 2013

Thank you to everyone who replied in this topic! In case anybody else encounters a speed problem in WinHTTrack, here's what you have to do:

1. In Mirror -> Modify Options, select the Scan Rules tab.

2. Append "--disable-security-limits" (without the quotes) to the text in the large text field on this tab.

3. Select the Limits tab.

4. In the Max Transfer Rate combo box, select the empty line or enter 0.

This bypasses the security limits, which you do at your own risk.
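
For those who use the command-line `httrack` rather than the GUI, the steps above can be sketched roughly as follows. This Python helper only builds the argument list; its name, the example URL and the output folder are invented for illustration. `--disable-security-limits` and `--max-rate` are options listed in HTTrack's help, but treating a rate of 0 as "no cap" is an assumption carried over from the GUI steps, not something verified here.

```python
import shlex


def build_httrack_command(url, output_dir, max_rate_bytes_per_s=0):
    """Return an httrack argument list with the security limits disabled."""
    return [
        "httrack",
        url,
        "-O", output_dir,                      # folder where the mirror is stored
        "--disable-security-limits",           # same switch as in step 2
        f"--max-rate={max_rate_bytes_per_s}",  # step 4; 0 is assumed to mean no cap
    ]


if __name__ == "__main__":
    # Placeholder URL and folder; the command is printed, not executed.
    cmd = build_httrack_command("http://www.example.com/", "mirror")
    print(shlex.join(cmd))
```

The same caution applies: lifting the limits puts more load on the server you are downloading from.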

[Edited at 2013-12-21 14:12 GMT]



Direct link Reply with quote
 

