- Python Web Scraping Cookbook
- Michael Heydt
- 2021-06-30 18:44:11
Introduction
A common practice in scraping is downloading, storing, and further processing media content (files that are neither web pages nor data files). This media can include images, audio, and video. To store the content correctly, whether locally or in a service like S3, we need to know the media type, and the file extension in the URL is not a reliable guide. We will learn how to download media and correctly identify its type based on information from the web server.
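As a rough illustration of the idea, the sketch below reads the `Content-Type` header returned by the server and maps it to a file extension with the standard library's `mimetypes` module. The function names (`media_type_from_header`, `extension_for`, `fetch_media`) and the `.bin` fallback are assumptions made for this example, not code from the book.

```python
import mimetypes
import urllib.request


def media_type_from_header(header_value):
    """Return the bare media type, e.g. 'image/png', from a Content-Type value.

    Parameters such as '; charset=utf-8' are dropped and the result is lowercased.
    """
    return header_value.split(";", 1)[0].strip().lower()


def extension_for(header_value, default=".bin"):
    """Pick a file extension from the Content-Type value, not from the URL."""
    media_type = media_type_from_header(header_value)
    # guess_extension returns None for types it does not know about.
    return mimetypes.guess_extension(media_type) or default


def fetch_media(url):
    """Download a resource and return (content bytes, suggested extension)."""
    with urllib.request.urlopen(url) as response:
        header_value = response.headers.get("Content-Type", "")
        return response.read(), extension_for(header_value)
```

A caller would pass the URL of an image or video to `fetch_media` and use the returned extension when naming the stored file. The server's header can itself be missing or wrong, so this is only a first approximation.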
Another common task is generating thumbnails of images, videos, or even a page of a website. We will examine several techniques for generating thumbnails and taking screenshots of web pages. These are often used on a new website as thumbnail links to the scraped media that is now stored locally.
Finally, there is often a need to transcode media, such as converting non-MP4 videos to MP4, or changing the bit rate or resolution of a video. Another scenario is extracting only the audio from a video file. We won't look at video transcoding, but we will rip MP3 audio out of an MP4 file using ffmpeg. It's a simple step from there to also transcode video with ffmpeg.
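To make the audio-extraction scenario concrete, here is a minimal sketch that builds an ffmpeg command line and runs it with `subprocess`. It assumes an `ffmpeg` binary is on the PATH and was built with the `libmp3lame` encoder. The function names and the quality setting (`-q:a 2`) are choices made for this example and may differ from the book's recipe.

```python
import subprocess
from pathlib import Path


def build_mp3_command(video_path, audio_path=None):
    """Build the ffmpeg argument list for extracting MP3 audio from a video."""
    video = Path(video_path)
    # Default to the same file name with an .mp3 suffix.
    audio = Path(audio_path) if audio_path else video.with_suffix(".mp3")
    return [
        "ffmpeg",
        "-y",                      # overwrite the output file if it exists
        "-i", str(video),          # input video
        "-vn",                     # drop the video stream
        "-codec:a", "libmp3lame",  # encode audio as MP3
        "-q:a", "2",               # variable bit rate quality (assumed setting)
        str(audio),
    ]


def extract_mp3(video_path, audio_path=None):
    """Run ffmpeg and return the path of the MP3 file it wrote."""
    command = build_mp3_command(video_path, audio_path)
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(command, check=True)
    return command[-1]
```

Keeping the command construction separate from its execution makes it easy to inspect or log the arguments before running them.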