
Summary

In this chapter, we extended our use of scikit-learn's classifiers and introduced the pandas library to manage our data. We analyzed real-world data on NBA basketball results, saw some of the problems that even well-curated data introduces, and created new features for our analysis.

We saw the effect that good features have on performance and used an ensemble algorithm, random forests, to further improve accuracy.

In the next chapter, we will extend the affinity analysis that we performed in the first chapter to build a program that finds similar books. We will see how to use algorithms for ranking and how approximation can improve the scalability of data mining.
