- Python Web Scraping Cookbook
- Michael Heydt
- 2021-06-30 18:44:02
How it works
XPath is an element of the XSLT (eXtensible Stylesheet Language Transformations) standard and provides the ability to select nodes in an XML document. HTML shares XML's tree structure of nested elements and attributes, and hence XPath can also work on HTML documents (although real-world HTML is often improperly formed, which can break XPath parsing in those cases).
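The caveat about improperly formed HTML can be illustrated with a minimal sketch. It assumes Python's standard-library `xml.etree.ElementTree`, which is a strict XML parser, rather than any particular library used elsewhere in the book. The two markup strings and the `count_paragraphs` helper are hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup: the first string is well-formed XML,
# the second leaves <p> and <br> unclosed, as hand-written HTML often does.
well_formed = "<html><body><p>Hello</p><br/></body></html>"
malformed = "<html><body><p>Hello<br></body></html>"


def count_paragraphs(markup):
    """Return the number of <p> elements, or None if the markup cannot be parsed."""
    try:
        root = ET.fromstring(markup)
    except ET.ParseError:
        # A strict XML parser rejects the document, so no path query can run.
        return None
    # ".//p" selects every <p> element at any depth below the root.
    return len(root.findall(".//p"))


well_formed_count = count_paragraphs(well_formed)
malformed_count = count_paragraphs(malformed)
print(well_formed_count, malformed_count)
```

A strict parser refuses the second document outright. HTML-aware parsers take the other approach and repair the tree first, which is why a query against messy HTML can succeed yet match nodes in places the author did not expect.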
XPath itself is designed to model the structure of XML nodes, attributes, and properties. Its syntax provides a means of finding the items in the XML that match an expression. An expression can match on, or apply logical comparisons to, any of the nodes, attributes, values, or text in the XML document.
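As a rough sketch of what such expressions look like, the following uses the standard-library `xml.etree.ElementTree`, which supports only a limited subset of XPath. The `planets` document and its `id`, `name`, and `moons` fields are made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, used only for illustration.
doc = """
<planets>
  <planet id="p1" name="Mercury"><moons>0</moons></planet>
  <planet id="p2" name="Earth"><moons>1</moons></planet>
  <planet id="p3" name="Mars"><moons>2</moons></planet>
</planets>
"""
root = ET.fromstring(doc)

# Match by node name: every <planet> that is a direct child of the root.
all_names = [p.get("name") for p in root.findall("./planet")]

# Match by attribute value: the <planet> whose name attribute is 'Earth'.
earth = root.find("./planet[@name='Earth']")
earth_id = earth.get("id")

# Match by text: planets with a <moons> child whose text is exactly '1'.
one_moon = [p.get("name") for p in root.findall("./planet[moons='1']")]

# Match by attribute presence: any element, at any depth, that has an id.
with_id = root.findall(".//*[@id]")

print(all_names, earth_id, one_moon, len(with_id))
```

Numeric comparisons such as `moons > 0`, and predicates joined with `and` or `or`, fall outside the subset that `ElementTree` understands. They need a full XPath 1.0 engine, such as the one in lxml.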
Understanding XPath is essential for knowing how to parse HTML and perform web scraping. And as we will see, it underlies many of the higher-level libraries, such as lxml, which provide implementations of it.