
  • Deep Learning By Example
  • Ahmed Menshawy
  • 377 words
  • 2021-06-24 18:52:40

Importing data with pandas

Python has many libraries for reading, transforming, and writing data. One of them is pandas (http://pandas.pydata.org/), an open source library that provides data analysis tools and easy-to-use data structures.

Pandas can be installed in several ways; the approach recommended here is to install it via conda (http://pandas.pydata.org/pandas-docs/stable/install.html#installing-pandas-with-anaconda).

“conda is an open source package management system and environment management system for installing multiple versions of software packages and their dependencies and switching easily between them. It works on Linux, OS X and Windows, and was created for Python programs but can package and distribute any software.” – conda website.
You can easily get conda by installing Anaconda, which is an open data science platform.

Let's see how to use pandas to read the advertising data samples. First, we need to import pandas:

import pandas as pd

Next, we can use the pandas.read_csv function to load our data into an easy-to-use pandas data structure called a DataFrame. For more information about pandas.read_csv and its parameters, you can refer to the pandas documentation for this function (https://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html):

# read advertising data samples into a DataFrame
advertising_data = pd.read_csv('http://www-bcf.usc.edu/~gareth/ISL/Advertising.csv', index_col=0)

The first argument passed to pandas.read_csv is a string giving the location of the file. It can be a local path or a URL; supported URL schemes include http, ftp, s3, and file. The second argument, index_col=0, is the index of the column whose values will be used as labels for the data rows.
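To make the effect of index_col concrete, here is a minimal sketch that reads a small CSV from an in-memory string instead of a URL. The column names (id, TV, radio, sales) and all values are invented for illustration and are not taken from the advertising file.

```python
import io

import pandas as pd

# Hypothetical sample data, invented for illustration only.
csv_text = """id,TV,radio,sales
1,10.0,20.0,5.0
2,40.0,50.0,8.0
3,70.0,80.0,11.0
"""

# index_col=0: the first column becomes the row labels.
with_index = pd.read_csv(io.StringIO(csv_text), index_col=0)

# No index_col: the first column stays as ordinary data and
# pandas generates a default integer index starting at 0.
without_index = pd.read_csv(io.StringIO(csv_text))

print(with_index.index.tolist())     # row labels taken from the id column
print(without_index.index.tolist())  # default integer labels
```

In the first DataFrame the id column no longer appears among the data columns, which is why it has one column fewer than the second.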

We now have the advertising_data DataFrame, which contains the advertising data from the URL, with each row labeled by the value in the first column. As mentioned earlier, pandas provides easy-to-use data structures that you can use as containers for your data. These data structures have methods associated with them, and you will use these methods to transform and operate on your data.
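As a rough illustration of what such methods look like, the sketch below builds a tiny stand-in DataFrame and calls a few common members on it. The frame df, its column names, and its values are hypothetical and chosen only to keep the example self-contained.

```python
import pandas as pd

# Hypothetical stand-in for the advertising data; values are invented.
df = pd.DataFrame(
    {"TV": [10.0, 40.0, 70.0], "sales": [5.0, 8.0, 11.0]},
    index=[1, 2, 3],
)

# Attribute: (number of rows, number of columns).
n_rows, n_cols = df.shape

# Method that operates on the data: per-column mean, returned as a Series.
column_means = df.mean()

# Method that transforms the data: returns a new DataFrame sorted by sales,
# leaving df itself unchanged.
sorted_df = df.sort_values("sales", ascending=False)

print(n_rows, n_cols)
print(column_means)
print(sorted_df)
```

Most DataFrame methods return a new object rather than modifying the original, so the result usually needs to be assigned to a name to be kept.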

Now, let's have a look at the first five rows of the advertising data:

# DataFrame.head returns the first n rows of the data; n defaults to 5,
# so this call is equivalent to advertising_data.head(n=5)
advertising_data.head()

Output:
