
Visualization is a good first step

Datasets later in the book will grow to thousands of features. With only four features in our starting example, we can easily plot all two-dimensional projections on a single page and build predictions, which can then be extended to large datasets with many more features. As we saw in Chapter 3, Regression, visualizations are excellent in the initial exploratory phase of the analysis: they let you learn the general features of your problem and catch data collection problems early.
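To see why four features fit on a single page while thousands do not, it helps to count the two-dimensional projections. The following is a minimal sketch using only the standard library; the feature names are hardcoded here as they appear in scikit-learn's copy of the Iris dataset, and the figure of 1,000 features is just an illustrative stand-in for a "large" dataset:

```python
from itertools import combinations
from math import comb

# Feature names as reported by scikit-learn's Iris dataset (hardcoded for illustration)
feature_names = [
    "sepal length (cm)",
    "sepal width (cm)",
    "petal length (cm)",
    "petal width (cm)",
]

# Every unordered pair of features is one two-dimensional projection
pairs = list(combinations(feature_names, 2))
for x_name, y_name in pairs:
    print(f"{x_name} vs. {y_name}")

n_pairs_small = len(pairs)     # number of subplots needed for 4 features
n_pairs_large = comb(1000, 2)  # same count for a hypothetical 1,000-feature dataset
print(n_pairs_small, n_pairs_large)
```

The number of pairs grows quadratically with the number of features, so plotting every projection stops being practical long before we reach thousands of features.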

Each subplot in the following plot shows all points projected onto two of the dimensions. The outlying group (triangles) consists of the Iris Setosa plants, while the Iris Versicolor plants are in the center (circles) and the Iris Virginica plants are plotted with x marks. We can see that there are two large groups: one of Iris Setosa and another that is a mixture of Iris Versicolor and Iris Virginica.

Here is the code to load the dataset (you can find the plotting code in the online repository):

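from sklearn.datasets import load_iris

# load_iris returns a dictionary-like object with several fields
data = load_iris()
features = data.data                # array of shape (150, 4)
feature_names = data.feature_names
target = data.target                # integer class index (0, 1, or 2) per plant
target_names = data.target_names
# Map each integer class index to its species name
labels = target_names[target]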
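The plotting code from the repository is not reproduced here. As a rough idea of what it might look like, here is a minimal sketch that draws one scatter plot per pair of features. The function name `plot_iris_projections`, the 2x3 grid layout, and the figure size are assumptions made for this illustration; only the marker choices (triangles, circles, and x marks) follow the description above:

```python
from itertools import combinations

import matplotlib.pyplot as plt
from sklearn.datasets import load_iris


def plot_iris_projections():
    """Scatter-plot every pair of Iris features (hypothetical helper)."""
    data = load_iris()
    features = data.data
    feature_names = data.feature_names
    target = data.target
    target_names = data.target_names

    # Four features give six unordered pairs, which fit a 2x3 grid
    pairs = list(combinations(range(features.shape[1]), 2))
    fig, axes = plt.subplots(2, 3, figsize=(12, 7))

    # Triangles for Setosa, circles for Versicolor, x marks for Virginica
    markers = [">", "o", "x"]
    for ax, (i, j) in zip(axes.flat, pairs):
        for t, marker in enumerate(markers):
            mask = target == t  # select the plants of class t
            ax.scatter(features[mask, i], features[mask, j],
                       marker=marker, label=str(target_names[t]))
        ax.set_xlabel(feature_names[i])
        ax.set_ylabel(feature_names[j])

    axes.flat[0].legend()
    fig.tight_layout()
    return fig
```

Calling `fig = plot_iris_projections()` followed by `plt.show()` displays the grid, and `fig.savefig(...)` writes it to a file instead.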