Summary

In this chapter, we extended our use of scikit-learn's classifiers and introduced the pandas library to manage our data. We analyzed real-world data on NBA basketball results, saw some of the problems that even well-curated data can introduce, and created new features for our analysis.

We saw the effect that good features have on performance and used an ensemble algorithm, random forests, to further improve accuracy.
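As a compact reminder of that workflow, the following is a minimal sketch rather than the chapter's actual code. It assumes a synthetic results table in place of the NBA data, and the column names (`HomeWinsLastSeason`, `VisitorWinsLastSeason`, `HomeRanksHigher`, `HomeWin`) are hypothetical, chosen only for illustration. It builds one engineered feature with pandas and then compares a single decision tree against a random forest using cross-validation.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a game results table (assumption: NOT the NBA dataset).
rng = np.random.RandomState(14)
n_games = 400
games = pd.DataFrame({
    "HomeWinsLastSeason": rng.randint(15, 68, size=n_games),
    "VisitorWinsLastSeason": rng.randint(15, 68, size=n_games),
})

# Hypothetical outcome: the home team wins more often when it was the
# stronger team last season, with noise added so the task is not trivial.
strength_gap = games["HomeWinsLastSeason"] - games["VisitorWinsLastSeason"]
games["HomeWin"] = (strength_gap + rng.normal(0, 15, size=n_games)) > 0

# Engineered feature: did the home team have more wins last season?
games["HomeRanksHigher"] = (
    games["HomeWinsLastSeason"] > games["VisitorWinsLastSeason"]
)

feature_columns = ["HomeWinsLastSeason", "VisitorWinsLastSeason", "HomeRanksHigher"]
X = games[feature_columns].astype(float).to_numpy()
y = games["HomeWin"].to_numpy()

# Compare a single tree with an ensemble of trees on the same features.
models = {
    "decision tree": DecisionTreeClassifier(random_state=14),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=14),
}
scores = {
    name: cross_val_score(model, X, y, scoring="accuracy", cv=5).mean()
    for name, model in models.items()
}
for name, score in scores.items():
    print("{0}: {1:.1f}% accuracy".format(name, score * 100))
```

On real data, the forest is not guaranteed to beat the single tree; the comparison is worth rerunning whenever the feature set changes.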

In the next chapter, we will extend the affinity analysis that we performed in the first chapter to build a program that finds similar books. We will see how to use algorithms for ranking and how to use approximation to improve the scalability of data mining.