Random forests

Random forests is a technique in which you construct multiple decision trees for a classification or regression task and then aggregate the outputs of those trees to produce the final result.

Random forests are an ensemble of randomized, fully grown decision trees that are as uncorrelated with one another as possible. Because each tree is fully grown, it has low bias and high variance. Because the trees are only weakly correlated, averaging them gives a large decrease in variance. The decorrelation comes from randomization: each decision tree in the random forest is given a randomly selected subset of the dataset and a randomly selected subset of the features.

The original paper describing random forests is available at the following link: https://www.stat.berkeley.edu/~breiman/randomforest2001.pdf

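The random forest technique does not reduce bias. Averaging leaves the bias of the ensemble equal to that of its individual randomized trees, and this bias is typically slightly higher than that of a single fully grown tree trained on all of the data and all of the features.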

Random forests were invented by Leo Breiman and have been trademarked by Leo Breiman and Adele Cutler. More information is available at the following link:  https://www.stat.berkeley.edu/~breiman/RandomForests.

Intuitively, in the random forest model, a large number of decision trees are trained on different samples of the data, and each tree may fit or overfit its own sample. Because the trees overfit in different ways, averaging their predictions cancels out much of the overfitting.
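The averaging intuition can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: each "tree" is modeled as the true value plus independent Gaussian noise, and the names TRUE_VALUE, NOISE_SD, N_TREES, and N_TRIALS are hypothetical values chosen for illustration. Real trees in a forest are partly correlated, so the reduction in variance is smaller in practice.

```python
import random
import statistics

rng = random.Random(42)

TRUE_VALUE = 5.0   # hypothetical target the trees try to predict
NOISE_SD = 2.0     # hypothetical spread of a single overfit tree
N_TREES = 50       # trees averaged in each simulated forest
N_TRIALS = 2000    # number of repeated experiments


def noisy_tree_prediction():
    """Stand-in for one high-variance tree: the truth plus independent noise."""
    return TRUE_VALUE + rng.gauss(0.0, NOISE_SD)


# Predictions from a single tree, repeated over many trials.
single = [noisy_tree_prediction() for _ in range(N_TRIALS)]

# Predictions from the average of N_TREES trees, repeated over many trials.
averaged = [
    statistics.fmean([noisy_tree_prediction() for _ in range(N_TREES)])
    for _ in range(N_TRIALS)
]

single_var = statistics.pvariance(single)
averaged_var = statistics.pvariance(averaged)

print("variance of a single tree:", round(single_var, 3))
print("variance of the average:  ", round(averaged_var, 3))
```

Under these assumptions, the average stays centered on the true value while its variance shrinks roughly in proportion to the number of trees.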

Random forests seem similar to bagging (bootstrap aggregating), but they are different. In bagging, every tree in the ensemble is trained on a random sample drawn with replacement, and each tree is trained on all the features. Random forests use the same kind of bootstrap sample, but the features are also sampled randomly: at each candidate split, only a random subset of the features is considered.
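The difference between the two sampling schemes can be sketched in a few lines. The helper names bootstrap_rows and split_candidates, and the parameter max_features, are hypothetical and used only for illustration; they are not taken from any particular library. Passing None for max_features models bagging, where every feature is a candidate at each split.

```python
import random


def bootstrap_rows(n_rows, rng):
    """Draw n_rows row indices with replacement (used by bagging and random forests)."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]


def split_candidates(n_features, max_features, rng):
    """Return the feature indices considered at one candidate split.

    max_features=None models bagging: all features are considered.
    An integer models a random forest: a random subset, drawn without replacement.
    """
    if max_features is None:
        return list(range(n_features))
    return sorted(rng.sample(range(n_features), max_features))


rng = random.Random(0)
n_rows, n_features = 8, 9

rows = bootstrap_rows(n_rows, rng)
bagging_feats = split_candidates(n_features, None, rng)
forest_feats = split_candidates(n_features, 3, rng)

print("bootstrap rows (duplicates are possible):", rows)
print("bagging, features at a split:            ", bagging_feats)
print("random forest, features at a split:      ", forest_feats)
```

In a real forest, split_candidates would be called again at every split of every tree, so different splits see different feature subsets.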

To predict values in a regression problem, the random forest model averages the predictions of the individual decision trees. To predict classes in a classification problem, the random forest model takes a majority vote over the predictions of the individual decision trees.
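The two aggregation rules can be sketched as follows. The function names aggregate_regression and aggregate_classification are hypothetical, and the per-tree predictions are made-up values for illustration, not output from a trained model.

```python
from collections import Counter
from statistics import fmean


def aggregate_regression(tree_predictions):
    """Regression: average the numeric predictions of the individual trees."""
    return fmean(tree_predictions)


def aggregate_classification(tree_predictions):
    """Classification: return the class predicted by the most trees.

    On a tie, Counter.most_common keeps the class that was seen first.
    """
    return Counter(tree_predictions).most_common(1)[0][0]


# Made-up predictions from three trees for a single input.
regression_result = aggregate_regression([2.0, 3.0, 4.0])
classification_result = aggregate_classification(["cat", "dog", "cat"])

print(regression_result)       # 3.0
print(classification_result)   # cat
```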

An interesting explanation of random forests can be found at the following link:  https://machinelearning-blog.com/2018/02/06/the-random-forest-algorithm/