
Loading the dataset

We can again thank scikit-learn for easy access to the dataset. We first import all the necessary modules, as we did earlier:

In [1]: import numpy as np
... from sklearn import datasets
... from sklearn import metrics
... from sklearn import model_selection as modsel
... from sklearn import linear_model
... %matplotlib inline
... import matplotlib.pyplot as plt
... plt.style.use('ggplot')

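Then, loading the dataset is a one-liner. Note that load_boston was deprecated in scikit-learn 1.0 and removed in version 1.2, so this call requires an older release: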

In [2]: boston = datasets.load_boston()

The boston object follows the same layout as the iris object discussed earlier. We can get more information about the dataset in 'DESCR', find all data in 'data', all feature names in 'feature_names', and all target values in 'target':

In [3]: dir(boston)
Out[3]: ['DESCR', 'data', 'feature_names', 'target']

The dataset contains a total of 506 data points, each of which has 13 features:

In [4]: boston.data.shape
Out[4]: (506, 13)

Each data point comes with a single target value, which is the housing price:

In [5]: boston.target.shape
Out[5]: (506,)
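
Before moving on, it can be worth confirming that the feature matrix and the target vector line up, that is, that there is exactly one target value per data point. The following is a minimal sketch of such a check. The helper name summarize_shapes is hypothetical and not part of scikit-learn, and the zero-filled arrays are only stand-ins with the same shapes as boston.data and boston.target, not real housing data:

```python
import numpy as np


def summarize_shapes(data, target):
    """Return (n_samples, n_features) after checking that data and target line up.

    Hypothetical helper for illustration; not part of scikit-learn.
    """
    data = np.asarray(data)
    target = np.asarray(target)
    if data.ndim != 2:
        raise ValueError("expected a 2-D feature matrix")
    if target.shape != (data.shape[0],):
        raise ValueError("expected one target value per data point")
    return data.shape


# Stand-in arrays with the same shapes as boston.data and boston.target.
# The zeros are placeholders, not real housing data.
X = np.zeros((506, 13))
y = np.zeros(506)

n_samples, n_features = summarize_shapes(X, y)
print(n_samples, n_features)
```

With the real dataset loaded, the same call would presumably be summarize_shapes(boston.data, boston.target).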