Download Files from Webpages using wget

I was looking for a way to download several PDF files from a webpage without right-clicking each link and choosing “Save as”. The best open source way I found is “wget”, a command-line tool that can download every file on a page that matches a given file extension. If you’re looking for a GUI way, use a browser add-on such as DownloadThemAll!

I prefer the command-line way because it avoids depending on browser add-ons, which can become memory hogs over time.

To download PDF files from a web page, here is what you have to do:

    1. Install wget. Download wget from here and extract the tar file to your preferred directory. If you’re using Microsoft Windows, get wget for Windows (complete package) and install it.
    2. Add wget to the OS path. Since this is a command-line utility, you need to add wget to the Windows path (or your OS path).
      From the command line, run a command like this example:
      set PATH=%PATH%;C:\Program Files (x86)\GnuWin32\bin
      Replace the directory with the actual location of the wget.exe file on your system. A path set this way lasts only for the current command prompt session.
    3. Now, run wget from the folder you want to download the files to, as follows:
      • Open a command prompt and cd (change directory) to your download folder.
      • Run wget -r -l 1 -nd -nH -A pdf http://example.com/pdftexts.php to download all PDF files linked from the webpage named in the command. To download other file formats, change the extension given after -A.
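
The command in step 3 can be wrapped in a small script if you use it often. What follows is only a minimal sketch, assuming a POSIX shell (for example on Linux or macOS, or a Unix-like shell on Windows). The function name download_by_ext and the DRY_RUN variable are made up for this illustration and are not part of wget. The sketch passes two comma-separated extensions to -A, which wget accepts as a list.

```shell
#!/bin/sh
# Hypothetical wrapper around the wget command from step 3.
# Usage: download_by_ext URL EXT[,EXT...]
download_by_ext() {
    url=$1
    exts=$2
    # -r     follow links recursively
    # -l 1   but only one level deep (the page itself and what it links to)
    # -nd    do not recreate the site's directory tree locally
    # -nH    do not create a directory named after the host
    # -A     accept only files whose names end in the listed extensions
    set -- wget -r -l 1 -nd -nH -A "$exts" "$url"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Illustration-only switch: print the command instead of running it.
        echo "$*"
    else
        "$@"
    fi
}

# Print the command that would be run for PDF and EPUB files.
DRY_RUN=1
download_by_ext "http://example.com/pdftexts.php" "pdf,epub"
```

With DRY_RUN set to 1 the script only prints the wget command, so you can check it before downloading anything. Set DRY_RUN to 0, or remove that line, and run the script from your download folder to fetch the files.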

For more help with wget commands, see:

Wget Manual