Automatic extraction of individual html pages from a Website
Thread poster: Noemi Carrera

Noemi Carrera  Identity Verified
Spain
Local time: 03:22
Member (2003)
English to Spanish
May 9, 2006

Hi everyone,

I need to translate a website that consists of lots of HTML pages. The client has not given us the pages themselves, only the .doc files.

I would prefer to work in TagEditor because there is a lot of formatting and there are tables everywhere, so I was wondering whether there is any software that can automatically extract all the individual HTML pages from a website.

Thank you very much in advance!

Best regards,

Noemí



Robert Tucker
United Kingdom
Local time: 02:22
German to English
+ ...
wget May 9, 2006

wget was originally written for Unix, but there are now Windows versions. You may want to search the net for the version you like the look of most; I found this one:

http://users.ugent.be/~bpuype/wget/

There is other software for the task, but I still find wget the easiest to use, even though it is a command-line tool.
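
For anyone who would rather not type the command by hand, here is a rough sketch of how a wget download of just the HTML pages could be wrapped in a small Python script. It is only an illustration, not something from the posts above: the function names are made up, and it assumes wget is installed and on your PATH. The options used are standard wget ones, but it is worth checking them against `wget --help` for the build you downloaded.

```python
import subprocess


def build_wget_command(url, dest_dir, depth=5):
    """Return the wget argument list for a recursive download of HTML pages.

    The function name and the default depth are illustrative assumptions.
    """
    return [
        "wget",
        "--recursive",                     # follow links found in each page
        f"--level={depth}",                # limit how many links deep to go
        "--no-parent",                     # never climb above the start URL
        "--accept=html,htm",               # keep only files with these suffixes
        f"--directory-prefix={dest_dir}",  # save everything under this folder
        url,
    ]


def download_site(url, dest_dir, depth=5):
    """Run wget; raises CalledProcessError if wget exits with an error code."""
    subprocess.run(build_wget_command(url, dest_dir, depth), check=True)
```

Calling `download_site("http://www.example.com/", "site_copy")` (both values are placeholders) should leave the pages under the `site_copy` folder, ready to open in TagEditor. If the site's pages end in something other than .html or .htm, those suffixes would need to be added to the `--accept` list.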


xxxtlmurray
Local time: 21:22
English
Acrobat, others May 9, 2006

Acrobat (Pro, at least) will dredge through an entire site and make a PDF of each page.

If you're fortunate enough to have a Mac, Webstractor (softchaos.com) pulls pages into a document that you can edit right there, rather like viewing a page "in Word". There may be similar tools for Windows.

I noticed you said the client gave you the .doc files. Do you mean that the website was generated from Word documents and you have the native docs? If so, it sounds like you're home free for translating...


Maria Asis  Identity Verified
Spain
Local time: 03:22
Member (2002)
English to Spanish
+ ...
Try WinHTTrack Website Copier May 9, 2006

Hi!

I'm a great fan of WinHTTrack!

Find it here: http://www.httrack.com/

Good luck!

MJ



