Understanding supervised learning with decision trees

The decision tree algorithm uses a tree-like model of decisions. Its name comes from the graphical representation of the cascading process that partitions the records. At each step, the algorithm chooses the input variable that best splits the dataset into subsets that are purer in terms of the target variable, ideally subsets that contain only one value of this variable. Decision trees are among the most widely used and easiest to understand classification algorithms.

The outcome of the tree algorithm calculation is a set of simple rules that explain which values or intervals of the input variables best split the original data. The fact that both the results and the path followed to reach them can be clearly shown gives decision trees an advantage over other algorithms. Explainability is a serious problem for some machine learning and artificial intelligence systems, which are mostly used as black boxes, and is a field of study in its own right.

In complex problems, we need to decide when to stop growing the tree. A large number of features can lead to a very large and complex tree, so the number of branches and the depth of the tree are usually limited by the user.

Entropy is a key concept in decision trees: it quantifies the purity of each subset. It measures the uncertainty about the target variable within a subset, so the lower the entropy, the more a subset tells us about the target. For a binary target, with entropy measured in bits, zero entropy means that the subset contains only one value of the target variable, while an entropy of one means that it contains equal numbers of both values. This concept will be explained later with examples.

Entropy is an indicator of how messy your data is.
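
As a minimal sketch of this idea, the following Python snippet computes the Shannon entropy of a list of target values using a base-2 logarithm. The function name `entropy` and the sample labels are illustrative assumptions and do not come from any particular library.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of target values."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    # Sum of -p * log2(p) over the observed target values.
    return sum(-(n / total) * log2(n / total) for n in counts.values())


# Hypothetical subsets of a binary target variable.
pure = ["yes", "yes", "yes", "yes"]
balanced = ["yes", "yes", "no", "no"]
skewed = ["yes", "yes", "yes", "no"]

print(entropy(pure))      # 0.0: only one value of the target
print(entropy(balanced))  # 1.0: both values are equally frequent
print(entropy(skewed))    # between 0 and 1
```

A subset that is mostly, but not entirely, one value falls between the two extremes, which is what makes entropy usable as a purity score.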

Using the entropy calculated at every step, the algorithm chooses the variable that best splits the data and then recursively repeats the procedure on each resulting subset. The user decides when the calculation stops: when all subsets have an entropy of zero, when there are no more features to split by, or when a minimum entropy level is reached.
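
The recursive procedure and its stopping rules can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not a production implementation: the names `split_entropy` and `build_tree`, the `min_entropy` parameter, and the toy dataset are all hypothetical, and leaves simply hold the most common target value.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of target values."""
    total = len(labels)
    counts = Counter(labels)
    return sum(-(n / total) * log2(n / total) for n in counts.values())


def split_entropy(rows, feature, target):
    """Weighted average entropy of the subsets produced by splitting on a feature."""
    groups = {}
    for row in rows:
        groups.setdefault(row[feature], []).append(row[target])
    return sum(len(g) / len(rows) * entropy(g) for g in groups.values())


def build_tree(rows, features, target, min_entropy=0.0):
    """Recursively split rows on the feature that yields the lowest entropy."""
    labels = [row[target] for row in rows]
    # Stop when the subset is pure enough or no features are left.
    if entropy(labels) <= min_entropy or not features:
        return Counter(labels).most_common(1)[0][0]
    best = min(features, key=lambda f: split_entropy(rows, f, target))
    remaining = [f for f in features if f != best]
    branches = {}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        branches[value] = build_tree(subset, remaining, target, min_entropy)
    return {best: branches}


# Hypothetical toy dataset: "outlook" separates the target perfectly.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]

tree = build_tree(rows, ["outlook", "windy"], "play")
print(tree)
```

In this toy data, splitting on `outlook` gives two subsets with zero entropy, whereas splitting on `windy` leaves both subsets evenly mixed, so the sketch selects `outlook` and stops after a single split.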

Categorical input features are the best suited for use in a decision tree. A continuous, numerical variable should first be converted into categories by dividing it into ranges; for example, A > 0.5 would be A1 and A ≤ 0.5 would be A2.
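
One possible way to perform this conversion is shown below, using the standard library's `bisect_left` to map a number to the range it falls in. The function name `to_category`, the sample values, and the three-range labels are assumptions made for illustration; only the 0.5 threshold and the A1/A2 labels come from the text.

```python
from bisect import bisect_left


def to_category(value, edges, labels):
    """Map a numeric value to the label of the range it falls in.

    edges must be sorted; labels needs one more entry than edges.
    bisect_left keeps a value equal to an edge in the lower range (value <= edge).
    """
    return labels[bisect_left(edges, value)]


# The example from the text: A <= 0.5 becomes A2, A > 0.5 becomes A1.
values = [0.12, 0.50, 0.51, 0.97]
categories = [to_category(v, [0.5], ["A2", "A1"]) for v in values]
print(categories)  # ['A2', 'A2', 'A1', 'A1']

# The same helper handles more than two ranges (hypothetical cut points).
print(to_category(0.4, [0.3, 0.7], ["low", "mid", "high"]))  # mid
```

The cut points themselves are a modeling choice; the sketch only shows the mechanics of assigning each value to a range.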

Let's look at an example that explains the concept of the decision tree algorithm.