
Relationships between variables

We will now look at a scatterplot matrix to see the relationships between some of these variables. A scatterplot matrix is a very useful plot, because it can indicate whether a linear classifier is likely to work well on our data, or whether we have to investigate more complicated methods.

We will add a call to scatter_matrix and set figsize to (18, 18), to make the plot easier to read.
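As a minimal sketch of that call, the following uses the scatter_matrix function from pandas.plotting. The DataFrame here is synthetic stand-in data, not the real dataset: the random values, the choice of three feature columns, and the use of 2 as the non-malignant label are all assumptions made purely for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

# Synthetic stand-in data (an assumption, NOT the real dataset).
rng = np.random.default_rng(0)
n = 200
size = rng.integers(1, 11, size=n)
# Make shape track size closely so one pair shows a linear pattern.
shape = np.clip(size + rng.integers(-1, 2, size=n), 1, 10)

df = pd.DataFrame({
    "clump_thickness": rng.integers(1, 11, size=n),
    "uniform_cell_size": size,
    "uniform_cell_shape": shape,
    "class": rng.choice([2, 4], size=n),  # 4 = malignant; 2 is an assumed label
})

# One scatterplot per pair of columns, histograms on the diagonal.
axes = scatter_matrix(df, figsize=(18, 18))

# In a script, call plt.show() here; in a notebook the figure renders inline.
```

The function returns a 2D array of matplotlib axes, one per pair of columns, which can be used to adjust individual panels.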

The output, as shown in the following screenshot, indicates the relationship between each variable and every other variable:

All of the variables are listed on both the x and the y axes. On the diagonal, where a variable intersects with itself, we can see the histograms that we saw previously.

In the block indicated by the mouse cursor in the preceding screenshot, we can see that there is a fairly strong linear relationship between uniform_cell_shape and uniform_cell_size. This is expected. Looking through the rest of the screenshot, we can see that some other blocks also show a good linear relationship. If we look at our classifications, however, there's no easy way to separate the classes using these relationships.
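A visual impression of linearity can be checked numerically with a correlation matrix. The sketch below uses the pandas corr method on synthetic stand-in data (an assumption, not the real dataset), built so that uniform_cell_shape tracks uniform_cell_size while clump_thickness is independent.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data (an assumption, NOT the real dataset).
rng = np.random.default_rng(0)
n = 200
size = rng.integers(1, 11, size=n)

df = pd.DataFrame({
    "uniform_cell_size": size,
    # shape = size plus a small random offset, clipped to the 1-10 scale
    "uniform_cell_shape": np.clip(size + rng.integers(-1, 2, size=n), 1, 10),
    # drawn independently of the other two columns
    "clump_thickness": rng.integers(1, 11, size=n),
})

# Pearson correlation between every pair of columns.
corr = df.corr()
print(corr.round(2))
```

Values close to 1 or -1 correspond to the tight diagonal bands in the scatterplot matrix, while values near 0 correspond to the diffuse blocks.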

In the class row of the preceding screenshot, we can see that 4 is the malignant classification. We can also see that samples scored anywhere from 1 to 10 on clump_thickness were still classified as malignant.
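The same observation can be read off a per-class summary instead of the plot. This is a small sketch using groupby on hypothetical toy rows (the values and the label 2 for the non-malignant class are assumptions for illustration only).

```python
import pandas as pd

# Hypothetical toy rows (an assumption, NOT taken from the real dataset).
df = pd.DataFrame({
    "clump_thickness": [1, 3, 5, 2, 1, 10, 5, 8],
    "class":           [2, 2, 2, 2, 4, 4, 4, 4],  # 4 = malignant
})

# Range of clump_thickness scores observed within each class.
summary = df.groupby("class")["clump_thickness"].agg(["min", "max"])
print(summary)
```

When one class spans the whole range of a feature, as the malignant rows do in this toy example, that feature alone cannot separate the classes with a simple threshold.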

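Thus, apart from a few correlated pairs such as uniform_cell_shape and uniform_cell_size, we come to the conclusion that there aren't any strong relationships between the variables of our dataset that would make the classification easy.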
