Various methods of importing data in Python

pandas is the Python library of choice for importing, wrangling, and manipulating datasets. Datasets come in various formats, the most common being .csv. In a CSV file, the delimiter (the special character that separates the values in each row) is a comma. We will now look at the various ways to read a dataset in Python.
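To make the role of the delimiter concrete, here is a minimal sketch that parses two in-memory strings, one comma-separated and one semicolon-separated, using the sep parameter of read_csv. The sample rows and variable names are invented for illustration and do not come from any real dataset.

```python
import io

import pandas as pd

# Invented sample data, used only to illustrate delimiters.
comma_text = "name,age\nAnn,29\nBob,41\n"
semicolon_text = "name;age\nAnn;29\nBob;41\n"

# read_csv assumes a comma delimiter by default.
comma_df = pd.read_csv(io.StringIO(comma_text))

# Any other delimiter has to be passed explicitly through sep.
semicolon_df = pd.read_csv(io.StringIO(semicolon_text), sep=";")

# Compare the two parsed tables.
print(comma_df.equals(semicolon_df))
```

Reading from io.StringIO keeps the example independent of any file on disk; with a real file, the path would simply replace the StringIO object.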

Case 1 – reading a dataset using the read_csv method

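Open an IPython Notebook by typing ipython notebook at the command line (on newer installations, where the notebook is part of Project Jupyter, the command is jupyter notebook).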

Download the Titanic dataset from the shared Google Drive folder (either the .xls or the .xlsx file will do). Save the file in CSV format and we are good to go. This is a very popular dataset containing information about the passengers aboard the Titanic on the voyage during which it sank. If you wish to know more about this dataset, you can look for it in the Google Drive folder.

A common practice is to share a variable description file along with the dataset, describing the context and significance of each variable. Since this is the first dataset we encounter in this book, here is its data description, to give a feel for what data description files actually look like:

Note
VARIABLE DESCRIPTIONS:
pclass          Passenger Class
                (1 = 1st; 2 = 2nd; 3 = 3rd)
survival        Survival
                (0 = No; 1 = Yes)
name            Name
sex             Sex
age             Age
sibsp           Number of Siblings/Spouses Aboard
parch           Number of Parents/Children Aboard
ticket          Ticket Number
fare            Passenger Fare
cabin           Cabin
embarked        Port of Embarkation
                (C = Cherbourg; Q = Queenstown; S = Southampton)
boat            Lifeboat
body            Body Identification Number
home.dest       Home/Destination
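A description file like this is what lets you turn coded values back into readable labels. The sketch below applies the codes listed above to a tiny hand-made table. The rows are invented for illustration and are not real passenger records, and the column names are taken from the description as written, so they may differ from the headers in the actual file.

```python
import pandas as pd

# Invented rows that use the codes from the description; not real passengers.
sample = pd.DataFrame({
    "pclass": [1, 3, 2],
    "survival": [1, 0, 1],
    "embarked": ["S", "C", "Q"],
})

# Lookup tables copied from the variable descriptions.
pclass_labels = {1: "1st", 2: "2nd", 3: "3rd"}
survival_labels = {0: "No", 1: "Yes"}
embarked_labels = {"C": "Cherbourg", "Q": "Queenstown", "S": "Southampton"}

# Replace each code with its label, leaving the original table untouched.
decoded = sample.assign(
    pclass=sample["pclass"].map(pclass_labels),
    survival=sample["survival"].map(survival_labels),
    embarked=sample["embarked"].map(embarked_labels),
)

print(decoded)
```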

The following code snippet is enough to import the dataset and get you started:

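 # pd is the conventional alias for pandas
 import pandas as pd
 # Replace this path with the location where you saved the CSV file
 data = pd.read_csv('E:/Personal/Learning/Datasets/Book/titanic3.csv')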
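Once the file is loaded, a few quick checks confirm that the import worked as expected. The following is a minimal sketch rather than a definitive recipe. Because the path above is specific to one machine, it writes a small stand-in CSV to a temporary directory first. The three rows and the file name sample.csv are invented for illustration, and the real titanic3.csv has more rows and columns.

```python
import os
import tempfile

import pandas as pd

# Invented stand-in rows; the second row has a missing age on purpose.
csv_text = "pclass,sex,age\n1,female,29\n3,male,\n2,male,40\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name, used only for this sketch.
    csv_path = os.path.join(tmp_dir, "sample.csv")
    with open(csv_path, "w") as handle:
        handle.write(csv_text)
    data = pd.read_csv(csv_path)

# Number of rows and columns that were read.
print(data.shape)

# Column names taken from the header row.
print(data.columns.tolist())

# First few rows, to check that values landed in the right columns.
print(data.head())

# Count of missing values per column; empty fields are read as NaN.
print(data.isnull().sum())
```

The same four checks apply unchanged to the DataFrame returned by the read_csv call on the Titanic file.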