
Downloading a page for offline analysis with Wget

Wget is part of the GNU project and is included in most major Linux distributions, including Kali Linux. It can recursively download a website for offline browsing, converting links and downloading non-HTML files along the way.

In this recipe, we will use Wget to download pages that are associated with an application in our vulnerable_vm.

Getting ready

All recipes in this chapter require vulnerable_vm to be running. In the scenario used throughout this book, it has the IP address 192.168.56.102.

How to do it...

  1. Let's make the first attempt to download the page by calling Wget with a URL as the only parameter:
    wget http://192.168.56.102/bodgeit/
    

    Wget only downloaded the index.html file, which is the start page of the application, to the current directory.

  2. We need some options to tell Wget to save the downloaded files in a specific directory and to fetch every file linked from the URL that we pass as the parameter. Let's first create a directory to hold the files:
    mkdir bodgeit_offline
    
  3. Now, we will recursively download all the files in the application and save them in the directory we just created:
    wget -r -P bodgeit_offline/ http://192.168.56.102/bodgeit/
    

How it works...

As mentioned earlier, Wget is a tool created to download HTTP content. With the -r parameter, we made it act recursively, that is, follow all the links in every page it downloads and download the linked content too. The -P option sets the directory prefix, which is the directory where Wget starts saving the downloaded content; by default, it is the current directory.
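
For readability in scripts, the same download can be written with Wget's long option names: --recursive is equivalent to -r and --directory-prefix to -P. The following is only a minimal sketch; the function name download_recursive and the WGET variable are hypothetical helpers introduced for illustration and are not part of Wget.

```shell
#!/bin/sh
# Sketch: the recursive download from step 3, spelled out with long options.
# WGET is a hypothetical hook that lets you point at another wget binary.
WGET="${WGET:-wget}"

# download_recursive PREFIX URL  (hypothetical helper name)
#   PREFIX  directory where Wget starts saving content (same as -P)
#   URL     start page; links found in it are followed (same as -r)
download_recursive() {
    "$WGET" --recursive --directory-prefix="$1" "$2"
}
```

Calling download_recursive bodgeit_offline/ http://192.168.56.102/bodgeit/ should behave like the command in step 3. With default settings, a recursive download is stored under a directory named after the host, so the files are expected to land in bodgeit_offline/192.168.56.102/bodgeit/.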

There's more...

There are some other useful options to be considered when using Wget:

  • -l: When downloading recursively, it might be necessary to limit how deep Wget goes when following links. This option, followed by the maximum number of levels, sets that limit; when it is not given, Wget stops at a depth of 5.
  • -k: After files are downloaded, Wget modifies all the links to make them point to the corresponding local files, thus making it possible to browse the site locally.
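  • -p: This option makes Wget download all the files needed to display the page properly, such as images and stylesheets. Requisites hosted on other sites are only fetched if host spanning is also enabled with -H.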
  • -w: This option, followed by a number of seconds, makes Wget wait that long between one download and the next. It's useful when the server has a mechanism to prevent automated browsing.
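
The options above can be combined with -r and -P in a single call. The following is a sketch under stated assumptions: the depth of 2 and the wait of 1 second are illustrative values rather than recommendations, and the mirror_site function and the WGET, DEPTH, and WAIT_SECONDS variables are hypothetical names introduced for this example.

```shell
#!/bin/sh
# Sketch: recursive download with depth limit, link conversion,
# page requisites and a delay between requests.
WGET="${WGET:-wget}"   # hypothetical hook to select the wget binary
DEPTH=2                # assumed value for -l (levels of links to follow)
WAIT_SECONDS=1         # assumed value for -w (seconds between downloads)

# mirror_site URL [PREFIX]  (hypothetical helper name)
mirror_site() {
    url="$1"
    prefix="${2:-bodgeit_offline/}"
    # -r recursive, -l depth limit, -k convert links for local browsing,
    # -p fetch page requisites, -w wait between downloads, -P save prefix
    "$WGET" -r -l "$DEPTH" -k -p -w "$WAIT_SECONDS" -P "$prefix" "$url"
}
```

For the application used in this recipe, the call would be mirror_site http://192.168.56.102/bodgeit/ bodgeit_offline/. Increasing WAIT_SECONDS slows the download but makes the traffic look less like an automated crawler.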