
Loading data from files into a DataFrame

The pandas library provides facilities for easy retrieval of data from a variety of data sources as pandas objects. As a quick example, let's examine the ability of pandas to load data in CSV format.

This example uses data/goog.csv, a file provided with the code for this book. Its contents are time-series financial data for Google stock.

The following statement uses the operating system (from within Jupyter Notebook or IPython) to display the content of this file. Which command you will need to use depends on your operating system:
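In a notebook, the shell escape is typically `!cat` on Unix-like systems and `!type` on Windows. As a portable alternative, the minimal sketch below prints the first few lines of the file from plain Python. The helper name `show_head` is hypothetical and is not part of pandas or of the book's code.

```python
from itertools import islice
from pathlib import Path


def show_head(path, n=5):
    """Return the first n lines of a text file, without trailing newlines."""
    with Path(path).open(encoding="utf-8") as f:
        return [line.rstrip("\n") for line in islice(f, n)]


csv_path = Path("data/goog.csv")  # path given in the text
if csv_path.exists():
    # Only attempt to display the file when it is actually present
    print("\n".join(show_head(csv_path)))
```

This works the same way on every operating system, at the cost of a few more lines than a shell command.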

This information can be easily imported into a DataFrame using the pd.read_csv() function:
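A minimal sketch of the load is shown here. So that it runs without the book's data file, it reads from an in-memory stand-in. The column names other than Date and every value in it are placeholders assumed for illustration, not real quotes.

```python
import io

import pandas as pd

# Hypothetical stand-in for data/goog.csv (placeholder columns and values)
sample_csv = io.StringIO(
    "Date,Open,High,Low,Close,Volume\n"
    "2016-12-19,100.0,102.0,99.0,101.0,1000\n"
    "2016-12-20,101.0,103.0,100.0,102.0,1100\n"
    "2016-12-21,102.0,104.0,101.0,103.0,1200\n"
)

# With the real file this would be: pd.read_csv('data/goog.csv')
df = pd.read_csv(sample_csv)
print(df)
```

pd.read_csv() accepts either a path or a file-like object, which is why the in-memory buffer can stand in for the file.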

pandas does not know that the first column in the file holds dates, so it has read the contents of the Date field as strings. This can be verified using the following pandas statement, which shows that a value in the Date column is a string:
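One way to run this check is sketched below, again on a small in-memory stand-in for the file (placeholder columns and values, assumed for illustration only).

```python
import io

import pandas as pd

# Hypothetical stand-in for data/goog.csv (placeholder columns and values)
sample_csv = io.StringIO(
    "Date,Open,Close\n"
    "2016-12-19,100.0,101.0\n"
    "2016-12-20,101.0,102.0\n"
)

df = pd.read_csv(sample_csv)

# Inspect the Python type of the first value in the Date column
first_date = df["Date"].iloc[0]
print(type(first_date))
```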

The parse_dates parameter of the pd.read_csv() function can be used to guide pandas on how to convert data directly into a pandas date object. The following informs pandas to convert the content of the Date column into actual Timestamp objects:

If we check whether it worked, we see that the date is a Timestamp:
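The conversion and the check can be sketched together as follows. The in-memory sample is a hypothetical stand-in for data/goog.csv with placeholder columns and values.

```python
import io

import pandas as pd

# Hypothetical stand-in for data/goog.csv (placeholder columns and values)
sample_csv = io.StringIO(
    "Date,Open,Close\n"
    "2016-12-19,100.0,101.0\n"
    "2016-12-20,101.0,102.0\n"
)

# Ask pandas to parse the Date column while reading
df = pd.read_csv(sample_csv, parse_dates=["Date"])

# The first value is now a pandas Timestamp rather than a string
first_date = df["Date"].iloc[0]
print(type(first_date))
```

Passing a list of column names to parse_dates limits the conversion to those columns and leaves the rest untouched.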

Unfortunately, this has not used the date field as the index for the DataFrame. Instead, pandas uses the default zero-based integer index labels:
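A short sketch of inspecting the index, using a hypothetical in-memory stand-in for the file (placeholder columns and values):

```python
import io

import pandas as pd

# Hypothetical stand-in for data/goog.csv (placeholder columns and values)
sample_csv = io.StringIO(
    "Date,Open,Close\n"
    "2016-12-19,100.0,101.0\n"
    "2016-12-20,101.0,102.0\n"
    "2016-12-21,102.0,103.0\n"
)

df = pd.read_csv(sample_csv, parse_dates=["Date"])

# Date is still an ordinary column; the index holds default integer labels
print(df.index)
```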

Note that this is now a RangeIndex, where in previous versions of pandas it would have been an integer index. We'll examine this difference later in the book.

This can be fixed using the index_col parameter of the pd.read_csv() function to specify which column in the file should be used as the index:

The index is now a DatetimeIndex, which lets us look up rows using dates.
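A minimal sketch of loading with a date index and then looking rows up by date follows. The sample data is a hypothetical stand-in for data/goog.csv, and selecting the index column by name (index_col="Date") is one option assumed here; a column position works as well.

```python
import io

import pandas as pd

# Hypothetical stand-in for data/goog.csv (placeholder columns and values)
sample_csv = io.StringIO(
    "Date,Open,Close\n"
    "2016-12-19,100.0,101.0\n"
    "2016-12-20,101.0,102.0\n"
    "2016-12-21,102.0,103.0\n"
)

# Parse the Date column and use it as the index
df = pd.read_csv(sample_csv, parse_dates=["Date"], index_col="Date")
print(type(df.index))

# Look up a single row by its date
row = df.loc[pd.Timestamp("2016-12-20")]

# Select a range of dates; both endpoints are included
window = df.loc["2016-12-20":"2016-12-21"]
print(window)
```

Label-based slices on a DatetimeIndex include both endpoints, unlike positional slices.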
