- Python Web Scraping Cookbook
- Michael Heydt
- 2021-06-30 18:44:12
How it works
In the constructor for URLUtility, there is a call to urllib.parse.urlparse. The following demonstrates using the function interactively:
>>> parsed = urlparse(const.ApodEclipseImage())
>>> parsed
ParseResult(scheme='https', netloc='apod.nasa.gov', path='/apod/image/1709/BT5643s.jpg', params='', query='', fragment='')
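The same call can be reproduced outside the book's code base. The following is a minimal sketch that assumes the URL returned by const.ApodEclipseImage() is the one implied by the ParseResult above; the literal string is reconstructed from that output rather than taken from the const module.

```python
from urllib.parse import urlparse

# Assumption: this literal is rebuilt from the ParseResult shown above,
# not read from the book's const module.
url = "https://apod.nasa.gov/apod/image/1709/BT5643s.jpg"

parsed = urlparse(url)

# ParseResult is a named tuple, so each component is available by name.
print(parsed.scheme)   # protocol part of the URL
print(parsed.netloc)   # host name
print(parsed.path)     # directory path plus filename
```

Because ParseResult is a named tuple, the components can also be accessed by index or unpacked, although attribute access is usually easier to read.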
The ParseResult object contains the various components of the URL. The path element contains the path and the filename. The call to the .filename_without_ext property returns just the filename without the extension:
    @property
    def filename_without_ext(self):
        filename = os.path.splitext(os.path.basename(self._parsed.path))[0]
        return filename
The call to os.path.basename returns only the filename portion of the path (including the extension). os.path.splitext() then separates the filename from the extension and returns them as a tuple; the property returns the first element of that tuple (the filename).
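The two steps can be tried in isolation. The sketch below is not the book's URLUtility class; it uses a hypothetical standalone function, filename_without_ext, and a URL literal reconstructed from the ParseResult shown earlier, to show the intermediate values.

```python
import os
from urllib.parse import urlparse


def filename_without_ext(url):
    """Hypothetical helper mirroring the property described above."""
    path = urlparse(url).path              # '/apod/image/1709/BT5643s.jpg'
    basename = os.path.basename(path)      # filename with extension
    name, ext = os.path.splitext(basename) # split off the extension
    return name


# Assumption: URL rebuilt from the ParseResult output, for illustration only.
url = "https://apod.nasa.gov/apod/image/1709/BT5643s.jpg"
print(filename_without_ext(url))  # BT5643s
```

Note that os.path.splitext splits only on the last dot, so a name such as archive.tar.gz would yield archive.tar rather than archive.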