
Ensemble methods and how they work

Ensemble methods are based on a very simple idea: instead of using a single model to make a prediction, we use many models and then aggregate their predictions in some way. Having different models is like having different points of view, and it has been demonstrated that by aggregating models that offer different points of view, predictions can be made more accurate. These methods also improve generalization over a single model because they reduce the risk of selecting one poorly performing classifier:

In the preceding diagram, we can see that each object belongs to one of three classes: triangles, circles, and squares. In this simplified example, we have two features with which to separate, or classify, the objects into the different classes. As you can see, we can use three different classifiers, each representing a different approach and producing a different kind of decision boundary.

Ensemble learning combines all those individual predictions into a single one. The predictions made by combining the three boundaries usually have better properties than those produced by any individual model. This is the simple idea behind ensemble methods, also called ensemble learning.
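As a minimal sketch of this combining step, the snippet below aggregates class labels from three hypothetical classifiers by majority vote (one common aggregation rule; others, such as averaging predicted probabilities, are also used). The classifier outputs here are made up for illustration:

```python
from collections import Counter

def majority_vote(predictions):
    """Aggregate class labels from several classifiers by majority vote."""
    # Counter.most_common(1) returns the single most frequent label
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical outputs of three classifiers for one object
votes = ["triangle", "circle", "triangle"]
print(majority_vote(votes))  # triangle
```

Even though one classifier misclassified the object as a circle, the aggregated prediction is correct, which is exactly the effect described above.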

The most commonly used ensemble methods are as follows:

  • Bootstrap sampling
  • Bagging
  • Random forests
  • Boosting

Before giving a high-level explanation of these methods, we need to discuss a very important statistical technique known as bootstrap sampling.
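To preview the idea, bootstrap sampling draws a new dataset of the same size as the original by sampling with replacement, so some observations appear multiple times and others not at all. The following is a minimal sketch; the function name and seed are illustrative, not from any particular library:

```python
import random

def bootstrap_sample(data, seed=None):
    """Draw a sample of the same size as `data`, sampling with replacement."""
    rng = random.Random(seed)
    return [rng.choice(data) for _ in range(len(data))]

data = list(range(10))
sample = bootstrap_sample(data, seed=0)
# With replacement: some values repeat, and on average roughly a third
# of the original values are left out of any one bootstrap sample.
print(sorted(sample))
```

Each call with a different seed yields a different resampled dataset, which is what lets methods such as bagging train many slightly different models from one original dataset.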
