Is it possible to extract ALL the contents of a web site?
Thread poster: Elena Miguel

Elena Miguel  Identity Verified
Spain
Local time: 10:13
English to Spanish
+ ...
Oct 24, 2005

I have to quote for the translation of a web site for a regular client. He did not design the site himself, so he cannot provide me with the files.
Is it possible to extract ALL the contents of a web site?
The site is full of pages and subpages and it is nearly impossible to do it "manually"...
Thank you in advance.



RWSTranslation
Germany
Local time: 10:13
Member (2007)
German to English
+ ...
Maybe Oct 24, 2005

Hello,

Maybe it is possible, if you find a way to see all the text. With dynamically generated pages, you will often see only some of the text.

You can try tools like Offline Explorer to save a website locally. But you can only save what the web server sends in response to your browser's settings and requests.

Hans



Roberto Tokuda  Identity Verified
Local time: 05:13
Member (2005)
Japanese to Spanish
+ ...
utility Oct 24, 2005

You can use WebStripper to download all the files of the web page(s), and of the pages they link to, onto your computer.

http://webstripper.net/
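
For readers curious what such a tool does under the hood, here is a minimal sketch in Python, not WebStripper's actual method. It assumes a plain HTML site and follows only `<a href>` links on the same host; the names `crawl`, `LinkCollector` and `fetch_over_http` are made up for this illustration.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_over_http(url):
    """Default fetcher: download one page and decode it as text."""
    with urlopen(url, timeout=30) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def crawl(start_url, fetch=fetch_over_http, max_pages=500):
    """Breadth-first walk over same-host links; returns {url: html}."""
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that fail to load
        pages[url] = html
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            # Resolve relative links and drop any "#fragment" part.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return pages
```

A sketch like this only sees pages that are reachable through ordinary links. Anything behind forms, scripts or logins stays invisible, which is the limitation the other replies in this thread point out.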

Regards



Gerard de Noord  Identity Verified
France
Local time: 10:13
Member (2003)
German to Dutch
+ ...
No, it's impossible Oct 24, 2005

DSC wrote:

Hello,

Maybe it is possible, if you find a way to see all the text. With dynamically generated pages, you will often see only some of the text.

You can try tools like Offline Explorer to save a website locally. But you can only save what the web server sends in response to your browser's settings and requests.

Hans


For the same reasons that Hans says maybe, I say no. You and your client can't be sure that the site is 100% HTML, and only in that case can you copy it successfully with the tools Hans mentioned. And even then you can't be sure that the HTML code isn't altered during the process.

Don't do it. I did it once some years ago and I regretted it.

Regards,
Gerard



Heinrich Pesch  Identity Verified
Finland
Local time: 11:13
Member (2003)
Finnish to German
+ ...
If you get the password for FTP Oct 25, 2005

...you can transfer all files concerned, translate them and be sure all is ok. Otherwise it's a risky thing to do. I wonder why someone wants to translate a site without having access to it.
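
If the client does hand over FTP access, the transfer itself could look roughly like the sketch below, which uses Python's standard `ftplib`. The host, login and folder names in the comment are placeholders, the function `download_folder` is invented for this example, and it copies a single folder without descending into subfolders.

```python
from ftplib import FTP, error_perm
from pathlib import Path


def download_folder(ftp, remote_dir, local_dir):
    """Copy every plain file in one remote folder to local_dir (not recursive)."""
    target = Path(local_dir)
    target.mkdir(parents=True, exist_ok=True)
    ftp.cwd(remote_dir)
    saved = []
    for name in ftp.nlst():
        if name in (".", ".."):
            continue
        local_path = target / name
        try:
            with open(local_path, "wb") as handle:
                ftp.retrbinary(f"RETR {name}", handle.write)
        except error_perm:
            # The server refused the download: most likely a subfolder.
            local_path.unlink()
            continue
        saved.append(name)
    return saved


# Hypothetical usage; replace host, login and paths with the client's details:
# with FTP("ftp.example.com") as ftp:
#     ftp.login("user", "password")
#     print(download_folder(ftp, "/public_html", "site_copy"))
```

Unlike a crawl through the browser, this retrieves the source files as they sit on the server, including any scripts that generate pages dynamically.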
Regards
Heinrich



Elena Miguel  Identity Verified
Spain
Local time: 10:13
English to Spanish
+ ...
TOPIC STARTER
Thank you Oct 25, 2005

Thank you everybody!
I finally managed to extract most of the files with WebStripper, and I hope this is enough to create a "tentative" quote, which is all they need for the moment.
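
For a tentative quote, a rough word count of the extracted pages is usually the figure needed. The following Python sketch is one way to get it, assuming the files were saved as UTF-8 HTML in a local folder. `TextExtractor`, `count_words` and `count_words_in_folder` are names made up for this example.

```python
from html.parser import HTMLParser
from pathlib import Path


class TextExtractor(HTMLParser):
    """Gathers text between tags, ignoring <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def count_words(html):
    """Approximate number of whitespace-separated words in one HTML page."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return len(" ".join(parser.chunks).split())


def count_words_in_folder(folder):
    """Return {relative path: word count} for files matching *.htm* under folder."""
    root = Path(folder)
    counts = {}
    for path in sorted(root.rglob("*.htm*")):
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="replace")
            counts[str(path.relative_to(root))] = count_words(text)
    return counts
```

The count is only an estimate: text in attributes such as `alt` or `title`, in images, and in pages the download tool never reached is not included, and repeated menus and footers are counted on every page.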
Regards.



Samuel Murray  Identity Verified
Netherlands
Local time: 10:13
Member (2006)
English to Afrikaans
+ ...
No, it is not Oct 25, 2005

Delelis wrote:
Is it possible to extract ALL the contents of a web site?


No. What the web server serves you is based on what user agent you make the request with.
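
To see this for yourself, you can request the same address while presenting different user agents and compare what comes back. This is a small Python sketch; the two user-agent strings are arbitrary examples, and `build_request` and `fetch_variants` are hypothetical helper names.

```python
from urllib.request import Request, urlopen

# Example user-agent strings, chosen only for illustration.
USER_AGENTS = {
    "desktop": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "mobile": "Mozilla/5.0 (iPhone; CPU iPhone OS 16_0 like Mac OS X)",
}


def build_request(url, agent_name):
    """Prepare a GET request that identifies itself with the chosen user agent."""
    return Request(url, headers={"User-Agent": USER_AGENTS[agent_name]})


def fetch_variants(url):
    """Download the same URL once per user agent; returns {name: bytes}."""
    variants = {}
    for name in USER_AGENTS:
        with urlopen(build_request(url, name), timeout=30) as response:
            variants[name] = response.read()
    return variants
```

If the byte strings returned for the same URL differ, the server is tailoring its output to the client, and any download made with a single tool captures only one of those variants.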

The site is full of pages and subpages and it is nearly impossible to do it "manually"...


You could use a download manager such as Getleft to download as much of the site as possible.

But even if the client doesn't have the original content... doesn't he have his own internet connection? Can't he download the site himself? He's probably too lazy, yes? Then your option is to download it yourself and send it to him (zipped) and ask him to indicate whether those pages are the pages he wants translated. Get the client to indicate exactly which pages he wants translated.
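
Packaging the download for the client can be automated as well. The Python sketch below zips a downloaded folder and writes a plain-text list of its HTML pages, which the client can mark up to show what needs translating. The function name `package_site` and the `_pages.txt` suffix are choices made for this example.

```python
import shutil
from pathlib import Path


def package_site(folder, archive_base):
    """Zip a downloaded site and write a plain-text list of its HTML pages.

    Returns (path to the .zip file, path to the page list).
    """
    root = Path(folder)
    pages = sorted(
        str(p.relative_to(root)) for p in root.rglob("*.htm*") if p.is_file()
    )
    # One page per line, so the client can tick off what to translate.
    list_path = Path(f"{archive_base}_pages.txt")
    list_path.write_text("\n".join(pages) + "\n", encoding="utf-8")
    # make_archive appends ".zip" to the base name and returns the full path.
    zip_path = shutil.make_archive(str(archive_base), "zip", root_dir=str(root))
    return Path(zip_path), list_path
```

Keep `archive_base` outside the folder being zipped, so that the archive does not end up including itself.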



Gwidon Naskrent  Identity Verified
Poland
Local time: 10:13
English to Polish
+ ...
Another solution Nov 11, 2005

If the site is dynamically generated, perhaps you could coax the client into providing you with the PHP source files. Or is there a tool suited to translating PHP content?
