
What is a decision tree?

Decision trees are a family of non-parametric supervised learning methods. In the decision tree algorithm, we start with the complete dataset and split it into two partitions based on a simple rule. The splitting continues until a specified criterion is met. The nodes at which the split is made are called interior nodes and the final endpoints are called terminal or leaf nodes.

As an example, let us look at the following tree:

Here, we are assuming that the exoplanet data has only two features: flux.1 and flux.2. First, we test whether flux.1 > 400 and divide the data into two partitions. Each partition is then divided again based on the flux.2 feature, and that second split decides whether an observation is an exoplanet or not. How did we come up with the condition flux.1 > 400? We did not; it is just an illustration. During the training phase, this is exactly what the model learns – the feature thresholds that best divide the data into partitions.
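To see this learning in action, here is a minimal sketch using scikit-learn's `DecisionTreeClassifier`. The data is synthetic – the flux.1 and flux.2 columns and the 400 threshold mimic the illustrative tree above, not the real exoplanet dataset:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(42)
n = 400
# Synthetic stand-ins for flux.1 (in [0, 800]) and flux.2 (in [-50, 50])
X = np.column_stack([rng.uniform(0, 800, n), rng.uniform(-50, 50, n)])
# Label mimics the tree in the text: exoplanet iff flux.1 > 400 and flux.2 > 0
y = ((X[:, 0] > 400) & (X[:, 1] > 0)).astype(int)

# A depth-2 tree is enough to recover both splitting conditions
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print("training accuracy:", tree.score(X, y))
print("root split: feature", tree.tree_.feature[0],
      "threshold", tree.tree_.threshold[0])
```

The learned root threshold lands close to the true boundary used to generate the labels – the algorithm recovers the splitting rule purely from the data, which is the point of the training phase described above.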

For classification problems, the leaf nodes of the decision tree show the result as a discrete class label, and for regression problems, the leaf nodes show the result as a predicted number. Decision trees, thus, are also popularly known as Classification and Regression Trees (CART).
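The regression case can be sketched in the same way with scikit-learn's `DecisionTreeRegressor`. The step-shaped target below is a hypothetical toy, chosen so a single split fits it exactly; each leaf then predicts the mean target value of the training samples that reach it:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy 1-D regression data: a step function, 1.0 below x=10 and 5.0 above
X = np.arange(20, dtype=float).reshape(-1, 1)
y = np.where(X.ravel() < 10, 1.0, 5.0)

# One split suffices; each leaf predicts the mean of its training targets
reg = DecisionTreeRegressor(max_depth=1).fit(X, y)
print(reg.predict([[3.0], [15.0]]))  # → [1. 5.]
```

Unlike the classifier, whose leaves hold class labels, the regressor's leaves hold numbers – hence the "Classification and Regression" in CART.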
