Use GNU Wget
Terminal
wget http://(website-name).org/
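* Replace the brackets and "website-name" with the name of the site you are downloading.
Wget can also download the images and other data that are nested within the site and linked from the top page, by following links recursively (by default up to five levels deep). Use: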
wget -r http://(website-name).org/
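If a site refuses to allow this because it tries to detect whether you are using a browser, the -U option sets the User-Agent string so that Wget identifies itself as one. The -p option also fetches the images and stylesheets each page needs to display. Use: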
wget -r -p -U Mozilla http://www.stupidsite.com/restricedplace.html
To avoid being blacklisted for downloading the site, use:
--wait=20 (waits 20 seconds between retrievals)
--limit-rate=20K (limits the download rate; the value is in bytes per second, so add K to tell Wget you mean KB/s)
For example:
wget --wait=20 --limit-rate=20K -r -p -U Mozilla http://www.stupidsite.com/restricedplace.html
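To get a feel for how slow a polite download is, the two settings above can be turned into a rough lower bound on the total time. This is only a sketch: the file count and total size are made-up figures for illustration, not measurements from any real site, and real transfers will usually take longer.

```shell
# Hypothetical figures for illustration only.
files=100        # assumed number of files on the site
total_kb=5000    # assumed combined size of those files, in kilobytes
wait_s=20        # matches --wait=20
rate_kb=20       # matches --limit-rate=20K (kilobytes per second)

# Wget pauses between retrievals, so there is one pause fewer than there are files.
wait_total=$(( (files - 1) * wait_s ))

# Time spent transferring if every download runs at exactly the capped rate.
transfer_total=$(( total_kb / rate_kb ))

estimate_s=$(( wait_total + transfer_total ))
echo "Estimated minimum time: ${estimate_s} seconds"
```

With these assumed numbers almost all of the time goes on the pauses rather than the transfers, so on a site with many small files --wait matters far more than --limit-rate.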
To make sure Wget stays inside the folder you are downloading and does not follow links up into the folders above it, use:
--no-parent
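Putting all of the options together, a minimal sketch could look like the following. The function name polite_mirror_cmd and the example.org URL are placeholders invented for this example. The function only prints the command so that it can be checked before anything is downloaded.

```shell
# polite_mirror_cmd is a hypothetical helper name, not part of Wget.
# It prints the full command for the URL given as its first argument.
polite_mirror_cmd() {
    echo wget -r -p --no-parent --wait=20 --limit-rate=20K -U Mozilla "$1"
}

# Placeholder URL; replace it with the folder you want to download.
polite_mirror_cmd "http://www.example.org/docs/"
```

Once the printed command looks right, copy it into the terminal to run it. Keep the trailing slash on the URL: Wget uses it to tell that "docs" is a folder, which is what --no-parent measures "above" from.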