
Getting ready

We will use the planets data page and convert its data into CSV and JSON files. Let's start by loading the planets data from the page into a list of Python dictionary objects. The following code (found in 03/get_planet_data.py) provides a function that performs this task and is reused throughout the chapter:

import requests
from bs4 import BeautifulSoup

def get_planet_data():
    # Download the page and parse it with the lxml parser
    html = requests.get("http://localhost:8080/planets.html").text
    soup = BeautifulSoup(html, "lxml")

    # Every planet is a table row with the CSS class "planet"
    planet_trs = soup.html.body.div.table.find_all("tr", {"class": "planet"})

    def to_dict(tr):
        # Map the cells of one row to named fields (cell 0 is not used)
        tds = tr.find_all("td")
        planet_data = dict()
        planet_data['Name'] = tds[1].text.strip()
        planet_data['Mass'] = tds[2].text.strip()
        planet_data['Radius'] = tds[3].text.strip()
        planet_data['Description'] = tds[4].text.strip()
        planet_data['MoreInfo'] = tds[5].find_all("a")[0]["href"].strip()
        return planet_data

    planets = [to_dict(tr) for tr in planet_trs]

    return planets

if __name__ == "__main__":
    print(get_planet_data())

Running the script gives the following output (truncated for brevity):

03 $ python get_planet_data.py
[{'Name': 'Mercury', 'Mass': '0.330', 'Radius': '4879', 'Description': 'Named Mercurius by the Romans because it appears to move so swiftly.', 'MoreInfo': 'https://en.wikipedia.org/wiki/Mercury_(planet)'}, {'Name': 'Venus', 'Mass': '4.87', 'Radius': '12104', 'Description': 'Roman name for the goddess of love. This planet was considered to be the brightest and most beautiful planet or star in the\r\n heavens. Other civilizations have named it for their god or goddess of love/war.', 'MoreInfo': 'https://en.wikipedia.org/wiki/Venus'}, {'Name': 'Earth', 'Mass': '5.97', 'Radius': '12756', 'Description': "The name Earth comes from the Indo-European base 'er,'which produced the Germanic noun 'ertho,' and ultimately German 'erde,'\r\n Dutch 'aarde,' Scandinavian 'jord,' and English 'earth.' Related forms include Greek 'eraze,' meaning\r\n 'on the ground,' and Welsh 'erw,' meaning 'a piece of land.'", 'MoreInfo': 'https://en.wikipedia.org/wiki/Earth'}, {'Name': 'Mars', 'Mass': '0.642', 'Radius': '6792', 'Description': 'Named by the Romans for their god of war because of its red, bloodlike color. Other civilizations also named this planet\r\n from this attribute; for example, the Egyptians named it "Her Desher," meaning "the red one."', 'MoreInfo':
...
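
The descriptions in the output above contain embedded '\r\n' sequences followed by runs of spaces, carried over from line breaks in the page's HTML. If those are unwanted in the CSV or JSON files, one option is to collapse the whitespace before storing each value. The following is a minimal sketch of that idea and is not part of get_planet_data.py; normalize_whitespace is a hypothetical helper name, and the sample string is shortened from the Venus description shown above.

```python
def normalize_whitespace(text):
    """Collapse every run of whitespace (spaces, carriage returns,
    newlines, tabs) into a single space and trim both ends."""
    # str.split() with no arguments splits on any whitespace run
    return " ".join(text.split())


# Sample shortened from the Venus description in the output above
raw = "the brightest and most beautiful planet or star in the\r\n        heavens."
print(normalize_whitespace(raw))
```

Under this assumption, the helper could be applied in place of, or after, the strip() calls inside to_dict.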

The csv and json modules ship with Python's standard library, so they do not need to be installed. You may need to install pandas, which you can do with the following command:

pip install pandas
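
As a quick check that the environment is ready, the following minimal sketch converts two planet records into CSV and JSON text using only the csv, json, and io modules from the standard library. The records are shaped like the dictionaries returned by get_planet_data(), but they are hard-coded here with only three of the five fields so that the sketch runs without the local web server. The function names planets_to_csv and planets_to_json are hypothetical and do not come from the book's code.

```python
import csv
import io
import json

# Two records shaped like those returned by get_planet_data();
# only three fields are kept here for brevity.
planets = [
    {"Name": "Mercury", "Mass": "0.330", "Radius": "4879"},
    {"Name": "Venus", "Mass": "4.87", "Radius": "12104"},
]


def planets_to_csv(rows):
    """Return the rows as CSV text, with a header taken from the first row's keys."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def planets_to_json(rows):
    """Return the rows as an indented JSON array of objects."""
    return json.dumps(rows, indent=2)


csv_text = planets_to_csv(planets)
json_text = planets_to_json(planets)
print(csv_text)
print(json_text)
```

If both prints succeed, csv and json are available; pandas can be checked separately with import pandas.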