Loading data into memory – viewing and managing with ease using pandas

First, we will need to load data into memory so that Python can interact with it. Pandas will be our data management and manipulation library:

# load data into Pandas
import pandas as pd
df = pd.read_csv("./data/iris.csv")
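
If the CSV file is not at hand, the same loading call can be tried on an in-memory stand-in. The following is a minimal sketch, not the book's data: the column names and the three rows are assumptions made up for illustration, and sample_df is a hypothetical name chosen so that it does not clash with df.

```python
import io

import pandas as pd

# Hypothetical stand-in for ./data/iris.csv. The column names and values
# below are illustrative assumptions, not the contents of the book's file.
csv_text = """sepal_length,sepal_width,petal_length,petal_width,species
5.1,3.5,1.4,0.2,setosa
7.0,3.2,4.7,1.4,versicolor
6.3,3.3,6.0,2.5,virginica
"""

# read_csv accepts a file path or any file-like object, so a StringIO
# buffer can take the place of a file on disk.
sample_df = pd.read_csv(io.StringIO(csv_text))

print(type(sample_df))             # the result is a pandas DataFrame
print(sample_df.columns.tolist())  # column names come from the header row
```

Because read_csv treats the first line as the header by default, the column names are taken from it and only the remaining lines become rows.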

Let's use some built-in pandas features to sanity check the data load and make sure that everything was loaded properly. First, we use the .shape attribute to check the size of the data (as rows and columns). Next, we check the contents of the DataFrame with the .head() method, which returns the first five rows as a new, smaller DataFrame for easy viewing. Finally, we use the .describe() method to show summary statistics for each numeric feature.

Pandas has many more sanity-check and quick-view features. For example, .tail() returns the final five rows of the data. Becoming proficient in pandas is well worth the time investment. The dedicated chapter later in the book is a good place to start, as is the essential basic functionality (https://pandas.pydata.org/pandas-docs/stable/getting_started/basics.html) page on the pandas documentation site.

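
As a rough illustration of these quick-view features, the sketch below runs .shape, .tail(), .describe(), .info(), and .isna() on a small made-up DataFrame. The name toy_df and its seven rows are assumptions for demonstration only and are not taken from the iris file.

```python
import pandas as pd

# Hypothetical toy data (seven rows) standing in for a loaded DataFrame.
toy_df = pd.DataFrame({
    "petal_length": [1.4, 1.3, 4.7, 4.5, 6.0, 5.1, 5.9],
    "species": ["setosa", "setosa", "versicolor", "versicolor",
                "virginica", "virginica", "virginica"],
})

# .shape is a (rows, columns) tuple, so it can be unpacked directly.
n_rows, n_cols = toy_df.shape

# .tail() returns the final five rows by default; pass n for a different count.
last_rows = toy_df.tail()
last_two = toy_df.tail(2)

# With mixed column types, .describe() summarizes only the numeric columns
# by default.
summary = toy_df.describe()

# .info() prints column dtypes and non-null counts; .isna().sum() counts
# missing values per column.
toy_df.info()
missing = toy_df.isna().sum()

print(last_rows)
print(summary.transpose())
print(missing)
```

Checks like these are cheap to run right after every load, and the missing-value count in particular tends to catch parsing problems that .head() alone would not reveal.
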
# sanity check with Pandas
print("shape of data in (rows, columns) is " + str(df.shape))
print(df.head())
print(df.describe().transpose())

You will see the following output after executing the preceding code:
