
Summary

Decision trees are intuitive algorithms that can perform both classification and regression tasks. They allow users to print out their decision rules, which is a plus when communicating a model's decisions to business personnel and other non-technical third parties. Additionally, decision trees are easy to configure since they have a limited number of hyperparameters. The two main decisions you need to make when training a decision tree are your splitting criterion and how to control the growth of your tree to strike a good balance between overfitting and underfitting. Understanding the limitations of the tree's decision boundaries is also paramount when deciding whether the algorithm is good enough for the problem at hand.
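As a quick recap of these points, here is a minimal sketch (the Iris dataset and the hyperparameter values are stand-ins for illustration, not recommendations); scikit-learn's export_text function renders the learned rules as plain text:

from sklearn.datasets import load_iris  # a stand-in dataset for illustration
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

x, y = load_iris(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=42)

# The two key choices: the splitting criterion and a growth control
# such as max_depth (both values here are illustrative)
clf = DecisionTreeClassifier(criterion='gini', max_depth=3)
clf.fit(x_train, y_train)

# Print the learned decision rules in a human-readable form
print(export_text(clf, feature_names=load_iris().feature_names))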

In this chapter, we looked at how decision trees learn and used them to classify a well-known dataset. We also learned about different evaluation metrics and how the size of our data affects our confidence in a model's accuracy. We then learned how to deal with this uncertainty using different data-splitting strategies. We saw how to tune the algorithm's hyperparameters to balance overfitting and underfitting. Finally, we built on this knowledge to build decision tree regressors and learned how the choice of splitting criterion affects the resulting predictions.
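The following sketch ties those last two ideas together; the synthetic dataset and the number of splits are assumptions made for illustration. Repeated random splits (via ShuffleSplit) give us a spread of scores rather than a single point estimate, and the regressor's criterion changes what each leaf predicts: squared error targets the mean of a leaf's samples, while absolute error targets the median (these criterion names apply as of scikit-learn 1.0; older releases used 'mse' and 'mae'):

from sklearn.datasets import make_regression
from sklearn.model_selection import ShuffleSplit, cross_val_score
from sklearn.tree import DecisionTreeRegressor

# A synthetic regression dataset standing in for the chapter's data
x, y = make_regression(n_samples=200, noise=10, random_state=42)

# Repeated random splits reduce our reliance on a single train/test split
cv = ShuffleSplit(n_splits=10, test_size=0.25, random_state=42)

for criterion in ['squared_error', 'absolute_error']:
    scores = cross_val_score(
        DecisionTreeRegressor(criterion=criterion, max_depth=5), x, y, cv=cv
    )
    # Report the mean R^2 and its spread across the random splits
    print(criterion, scores.mean().round(3), '+/-', scores.std().round(3))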

I hope this chapter has served as a good introduction to scikit-learn and its consistent interface. With this knowledge at hand, we can move on to our next algorithm and see how it compares to this one. In the next chapter, we will learn about linear models. This set of algorithms has its roots in the 18th century, and it is still one of the most commonly used algorithms today.
