
What you need for this book

All the code used in this book has been tested with Python 2.7 and is available for download at http://bitbucket.org/wswp/code. Ideally, a future edition of this book will port the examples to Python 3. For now, however, many of the required libraries (such as Scrapy/Twisted, Mechanize, and Ghost) are only available for Python 2. To help illustrate the crawling examples, we created a sample website at http://example.webscraping.com. This website limits how fast you can download content, so if you would prefer to host it yourself, the source code and installation instructions are available at http://bitbucket.org/wswp/places.
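Because the sample website limits download speed, a crawler pointed at it should pause between requests. The following is only a minimal sketch of one way to do that, not the code from the book's repository: the `Throttle` class, its `wait` method, and the `clock` and `sleep` parameters are hypothetical names introduced here for illustration, and the delay you need for the sample site is an assumption you would have to tune yourself.

```python
import time


class Throttle(object):
    """Enforce a minimum delay between successive calls to wait().

    Hypothetical helper for illustration; not part of the book's code.
    """

    def __init__(self, delay, clock=time.time, sleep=time.sleep):
        self.delay = delay          # minimum seconds between requests
        self.clock = clock          # injectable so the class can be tested
        self.sleep = sleep          # without really sleeping
        self.last_request = None    # time of the previous request, if any

    def wait(self):
        """Sleep if the last request was too recent; return seconds slept."""
        slept = 0.0
        if self.last_request is not None:
            remaining = self.delay - (self.clock() - self.last_request)
            if remaining > 0:
                self.sleep(remaining)
                slept = remaining
        self.last_request = self.clock()
        return slept
```

In use, you would create one `Throttle` (for example, `Throttle(2.0)` for an assumed two-second gap) and call its `wait()` method immediately before each download. The code avoids version-specific syntax, so it should run under Python 2.7 as well as Python 3.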

We decided to build a custom website for many of the examples in this book, instead of scraping live websites, so that we have full control over the environment. This gives us stability: live websites are updated more often than books, so by the time you try a scraping example against one, it may no longer work. A custom website also lets us craft examples that illustrate specific skills and avoid distractions. Finally, the owners of a live website might not appreciate our using it to learn about web scraping and might try to block our scrapers. Using our own custom website avoids these risks; however, the skills learned in these examples can certainly still be applied to live websites.
