
Inputting data using Python

Similarly, we can use Python to retrieve the data, as shown in the code here:

import pandas as pd 
path="http://archive.ics.uci.edu/ml/machine-learning-databases/" 
dataset="iris/bezdekIris.data" 
inFile=path+dataset 
data=pd.read_csv(inFile,header=None) 
data.columns=["sepalLength","sepalWidth","petalLength","petalWidth","Class"] 

After retrieving the data, we can call data.head(2) and print the result to see the first two instances:

> print(data.head(2))
   sepalLength  sepalWidth  petalLength  petalWidth        Class
0          5.1         3.5          1.4         0.2  Iris-setosa
1          4.9         3.0          1.4         0.2  Iris-setosa
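The column names can also be supplied at read time through the names argument of pd.read_csv(), instead of being assigned afterwards. The following is a minimal sketch of that variant; to keep it runnable offline, it assumes an in-memory stand-in (the sample variable, which is not part of the original code) holding only the two rows printed above rather than the full remote file:

```python
import io

import pandas as pd

# Stand-in for the remote file: just the two rows printed in the text.
sample = io.StringIO(
    "5.1,3.5,1.4,0.2,Iris-setosa\n"
    "4.9,3.0,1.4,0.2,Iris-setosa\n"
)

columns = ["sepalLength", "sepalWidth", "petalLength", "petalWidth", "Class"]

# header=None: the file has no header row; names=: label the columns on read.
data = pd.read_csv(sample, header=None, names=columns)

print(data.shape)   # number of rows and columns
print(data.dtypes)  # four numeric columns plus the text Class column
```

Replacing sample with the inFile URL from the earlier code should give the same column layout for the full dataset.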

When typing pd.read_csv(, we can see the definitions of all its input variables, as shown in the following screenshot. Again, to save space, only the first several input variables are shown:
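Outside an editor that shows such a pop-up, the same list of input variables can be obtained with the standard inspect module. This is a small sketch rather than the method used in the text; the number of parameters shown (five) is an arbitrary choice, and the defaults printed depend on the installed pandas version:

```python
import inspect

import pandas as pd

# Collect the parameters of pd.read_csv in declaration order.
signature = inspect.signature(pd.read_csv)
first_params = list(signature.parameters.values())[:5]

# Each entry prints as "name" or "name=default".
for param in first_params:
    print(param)
```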

In case the dataset link changes in the future, a backup copy of the dataset is available on the author's website, as shown in the following Python code:

import pandas as pd 
inFile="http://canisius.edu/~yany/data/bezdekIris.data.txt" 
d=pd.read_csv(inFile,header=None) 
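The primary and backup locations can be combined so that the backup is tried only when the first one fails. The sketch below is one possible way to do that; read_first_available is a hypothetical helper, not a pandas function, and the demonstration uses a deliberately missing local path and an in-memory one-row file (both assumptions) so that it runs without a network connection:

```python
import io

import pandas as pd


def read_first_available(sources, **read_csv_kwargs):
    """Return a DataFrame from the first source that pandas can open."""
    last_error = None
    for source in sources:
        try:
            return pd.read_csv(source, **read_csv_kwargs)
        except OSError as err:  # missing file, unreachable host, HTTP error
            last_error = err
    raise OSError("none of the sources could be read") from last_error


# Hypothetical stand-ins: a path that does not exist, then a one-row backup.
missing = "no_such_directory/bezdekIris.data"
backup = io.StringIO("5.1,3.5,1.4,0.2,Iris-setosa\n")

d = read_first_available([missing, backup], header=None)
print(d.shape)
```

Passing the two URLs from the text, in order, in place of missing and backup would apply the same logic to the real dataset.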

The following table shows several functions included in the pandas package that we could use to retrieve data:

Table 3.4 Functions included in the Python pandas module for inputting data
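The installed pandas package can also be asked directly which reading functions it offers. The following sketch lists every top-level pandas name starting with read_; it is not a reproduction of Table 3.4, and the exact list depends on the pandas version in use:

```python
import pandas as pd

# Every callable in the top-level pandas namespace whose name starts with "read_".
readers = sorted(
    name
    for name in dir(pd)
    if name.startswith("read_") and callable(getattr(pd, name))
)

for name in readers:
    print(name)
```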

To find detailed information on any of the preceding functions, we can use the help() function. For example, to get more information about the read_sas() function, we issue the following commands:

import pandas as pd 
help(pd.read_sas) 

The corresponding output, the top part only, is shown here:
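Because the exact help text varies with the installed pandas version, one way to reproduce its top part locally is to read the function's docstring with the standard inspect module. This is only a sketch; the cutoff of eight lines is an arbitrary assumption:

```python
import inspect

import pandas as pd

# inspect.getdoc returns the cleaned-up docstring that help() is built from.
doc = inspect.getdoc(pd.read_sas) or ""

# Keep only the top part, as the text does.
top_lines = doc.splitlines()[:8]
print("\n".join(top_lines))
```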
