Title: handy wget cheatsheet
Post by: littleman on June 03, 2012, 07:04:13 AM

http://notepad.benfinoradin.info/wp-content/uploads/2012/05/wget_cheat-sheet.pdf

Title: Re: handy wget cheatsheet
Post by: 4Eyes on June 04, 2012, 04:39:33 PM

nice - thanks :)

Title: Re: handy wget cheatsheet
Post by: Chunkford on June 06, 2012, 12:39:24 PM

Out of interest, when wget mirrors a site does it find what to download by the links in the source code?
I take it it won't download files like Google's ownership-verification file?

Title: Re: handy wget cheatsheet
Post by: jetboy on June 06, 2012, 06:10:44 PM

Aye, it follows links in the HTML, just like a search engine spider. As such, orphaned files (such as Google's) will get missed.
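
To make the "follows links" point concrete, here is a minimal Python sketch of link discovery over a tiny in-memory site. It is not wget's actual code, only an illustration of the idea; the example.com URLs and the google1234.html file name are made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects absolute URLs from href/src attributes on a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative links against the page they appear on.
                self.links.add(urljoin(self.base_url, value))


def discover(pages, start_url):
    """Breadth-first walk over a dict of URL -> HTML, staying on one host."""
    host = urlparse(start_url).netloc
    seen, queue = set(), [start_url]
    while queue:
        url = queue.pop(0)
        if url in seen or url not in pages:
            continue
        seen.add(url)
        collector = LinkCollector(url)
        collector.feed(pages[url])
        queue.extend(u for u in collector.links if urlparse(u).netloc == host)
    return seen


# Hypothetical site: the verification file exists but nothing links to it.
SITE = {
    "http://example.com/": '<a href="/about.html">About</a>',
    "http://example.com/about.html": '<a href="/">Home</a>',
    "http://example.com/google1234.html": "verification file, linked from nowhere",
}

found = discover(SITE, "http://example.com/")
```

The crawl only ever reaches pages that some already-visited page points to, so the verification file never shows up in `found`.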

Title: Re: handy wget cheatsheet
Post by: Chunkford on June 06, 2012, 07:16:44 PM

Cheers, just thought I would double check :)

Title: Re: handy wget cheatsheet
Post by: littleman on June 06, 2012, 07:50:05 PM

Wget did a great job converting an old dynamic site of mine into static. It really saved me a ton of work. It ripped the site, saved it locally and converted all the link references so they'd work in static HTML.

wget --mirror -p --html-extension --base=./ -k -P ./ http://domain.com
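
For anyone scripting this, a rough Python sketch of the same command with the short flags spelled out as long options. The function names are made up for illustration. The sketch leaves out --base, since as far as I can tell from the wget manual it only affects links read from an input file given with -i.

```python
import subprocess


def build_mirror_command(url, dest="./"):
    """Return a wget argument list for a static mirror of `url` (illustrative)."""
    return [
        "wget",
        "--mirror",                     # recursion plus timestamping, no depth limit
        "--page-requisites",            # -p: also fetch images, CSS etc. the pages need
        "--html-extension",             # save pages as .html (newer wget: --adjust-extension)
        "--convert-links",              # -k: rewrite links so they work locally
        "--directory-prefix=" + dest,   # -P: where to save everything
        url,
    ]


def run_mirror(url, dest="./", cwd=None):
    """Run wget from the directory `cwd`; assumes wget is on the PATH."""
    return subprocess.run(build_mirror_command(url, dest), cwd=cwd, check=True)
```

Calling `run_mirror("http://domain.com")` would start the download; building the list separately makes it easy to print and check the command first.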

Title: Re: handy wget cheatsheet
Post by: Chunkford on June 19, 2012, 02:37:18 PM

Proper noob alert here.

I'm having a play with wget. I've installed it and I'm running it from the Windows command line, but where the hell does it save the files that it downloads?

Title: Re: handy wget cheatsheet
Post by: BoL on June 19, 2012, 02:38:37 PM

It saves to the directory you are running it from, iirc, and the folder (at least when scraping a particular site) is named after the site's host name.

Title: Re: handy wget cheatsheet
Post by: Chunkford on June 19, 2012, 02:41:02 PM

That's what I thought, but nothing's there.

Hummm, more play time I think.

Thanks

Title: Re: handy wget cheatsheet
Post by: Chunkford on June 19, 2012, 03:25:32 PM

Ha, found it.

It didn't save it to the folder I was in, which was C:\Program Files (x86)\GnuWin32\bin
Instead it saved it to C:\Users\username\AppData\Local\VirtualStore\Program Files (x86)\GnuWin32\bin

Strange, but nevertheless it's there :)
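
The redirect looks like Windows file virtualization (part of UAC): when a non-elevated program tries to write under a protected folder such as Program Files, the write goes to a per-user VirtualStore folder with the same sub-path. A small Python sketch of that mapping, assuming the rule is simply "drop the drive, prepend the VirtualStore folder"; the function name is made up for illustration.

```python
from pathlib import PureWindowsPath


def virtualstore_path(blocked_path, local_appdata):
    """Map a protected path to where a virtualized write would land (assumed rule)."""
    parts = PureWindowsPath(blocked_path).parts
    # parts[0] is the drive and root ("C:\\"); the rest of the path is kept as-is.
    return PureWindowsPath(local_appdata, "VirtualStore", *parts[1:])


redirected = virtualstore_path(
    r"C:\Program Files (x86)\GnuWin32\bin",
    r"C:\Users\username\AppData\Local",
)
```

With the two paths from this thread, `str(redirected)` comes out as the VirtualStore folder above.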

Title: Re: handy wget cheatsheet
Post by: JasonD on June 19, 2012, 10:52:13 PM

> GnuWin32\bin

That makes sense as it's a Windows port of a Unix utility. It follows the *nix directory layout, where bin is where BINaries are stored, so that is the likely location of wget.

Try CDing to a different directory at the command line, then invoking wget with the full path to the executable.

EG.

cd c:\downloads\websites\google.com
"C:\Program Files (x86)\GnuWin32\bin\wget.exe" --mirror -p --html-extension --base=./ -k -P ./ http://google.com

Title: Re: handy wget cheatsheet
Post by: Chunkford on June 20, 2012, 10:18:23 AM

Well, I've learnt something new today!

That makes perfect sense now and the name VirtualStore gives it away a bit.

Cheers Jason
