Automatic extraction of individual html pages from a Website
Thread poster: Noemi Carrera

Noemi Carrera
Spain
Local time: 15:06
Member (2003)
English to Spanish
May 9, 2006

Hi everyone,

I need to translate a website that consists of lots of HTML pages. The client has not provided us with these pages, just with the .doc files.

I would prefer to work in TagEditor because there is a lot of formatting and tables everywhere, and I was wondering whether there is any software that can automatically extract all the individual HTML pages from a website.

Thank you very much in advance!

Best regards,

Noemí


 

Robert Tucker
United Kingdom
Local time: 14:06
German to English
wget May 9, 2006

wget was originally written for Unix, but there are now Windows versions. You may want to search the net for the version you like the look of most; I found this one:

http://users.ugent.be/~bpuype/wget/

There is other software for the task, but I still find wget the easiest to use, even though it is a command-line tool.
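
For anyone who would rather script the job than install a tool, the general idea (start at one page, follow every link that stays on the same host, and save each page to disk) can be sketched in Python. This is only a rough illustration, not how wget itself works. The names `mirror`, `local_path`, `fetch` and `fetch_page` are made up for the example, it assumes the pages are UTF-8 text, and it only follows `<a href>` links, so images and stylesheets are not downloaded.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def local_path(url, out_dir):
    """Map a URL to a file under out_dir (directory URLs become index.html)."""
    path = urlparse(url).path
    if path == "" or path.endswith("/"):
        path += "index.html"
    return Path(out_dir) / path.lstrip("/")


def fetch(url):
    """Download one page as text (assumption: the site is UTF-8)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def mirror(start_url, out_dir, fetch_page=fetch, max_pages=200):
    """Save every same-host page reachable from start_url; return saved URLs."""
    host = urlparse(start_url).netloc
    queue, seen, saved = [start_url], {start_url}, []
    while queue and len(saved) < max_pages:
        url = queue.pop(0)
        try:
            html = fetch_page(url)
        except OSError:
            continue  # skip pages that fail to download
        target = local_path(url, out_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(html, encoding="utf-8")
        saved.append(url)
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts.
            link = urldefrag(urljoin(url, href)).url
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return saved
```

Something like `mirror("http://www.example.com/", "site_copy")` (a placeholder address) would then leave one file per page under `site_copy`, which can be opened in TagEditor. A dedicated tool is still the safer choice for a real job, since it also handles images, redirects and unusual encodings.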


 

xxxtlmurray
Local time: 09:06
English
Acrobat, others May 9, 2006

Acrobat (Pro, at least) will dredge through an entire site and make a PDF of each page.

If you're fortunate enough to have a Mac, Webstractor (softchaos.com) pulls pages into a document that allows editing right there, sort of like viewing a page "in Word". There may be similar tools for Windows.

I noticed you said the client gave you the .doc files. Do you mean that the Web site is made from Word-to-Web, and you have the native docs? Because that sounds like you're home free for translating...


 

Maria Asis
Spain
Local time: 15:06
Member (2002)
English to Spanish
Try WinHTTrack Website Copier May 9, 2006

Hi!

I'm a great fan of WinHTTrack!

Find it here: http://www.httrack.com/

Good luck!

MJ


 


