- Python Web Scraping Cookbook
- Michael Heydt
- 2021-06-30 18:44:03
Loading data in Unicode / UTF-8
A document's encoding tells an application how the characters in the document are represented as bytes in the file. Essentially, the encoding specifies how each character maps to one or more bytes. In a standard ASCII document, every character is stored in a single 8-bit byte (ASCII itself defines only 128 characters, so it needs just 7 of those bits). HTML files are often encoded with one byte per character, but with the globalization of the internet, this is not always the case. Many HTML documents use 16-bit code units, or a variable-width encoding in which a character may occupy one byte or several.
A particularly common form of HTML document encoding is UTF-8, a variable-width encoding of Unicode that uses between one and four bytes per character and is byte-for-byte compatible with ASCII for the first 128 characters. This is the encoding form that we will examine.
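To make the idea of variable-width encoding concrete, the following is a minimal sketch using only Python's built-in `str.encode` and `bytes.decode`. The helper name `utf8_lengths` and the sample strings are illustrative assumptions, not part of the recipe itself.

```python
def utf8_lengths(text):
    """Return (character, number of UTF-8 bytes) pairs for each character."""
    return [(ch, len(ch.encode("utf-8"))) for ch in text]


# Characters chosen (as an illustration) to cover 1-, 2-, 3- and 4-byte sequences.
for ch, size in utf8_lengths("Aé€😀"):
    print(repr(ch), size)

# The same bytes decoded with the wrong encoding produce garbled text.
raw = "café".encode("utf-8")        # 5 bytes: the 'é' takes two of them
print(raw)
print(raw.decode("utf-8"))          # decodes back to the original string
print(raw.decode("latin-1"))        # wrong encoding: each byte becomes one character
```

The last two lines show why a scraper needs to know the encoding of the bytes it receives: decoding UTF-8 bytes as a single-byte encoding does not raise an error, it silently yields the wrong characters.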