Deciding whether to train outdoors depending on the weather

Let's suppose we have historical data on an experienced football trainer's decisions about whether to train outdoors (outside the gym) with her team, along with the weather conditions on the days when those decisions were made.

A typical dataset could look as follows:

The dataset was specifically created for this example and, of course, might not represent any real decisions.

In this example, the target variable is Train outside and the rest of the variables are the model features.

According to the data table, a possible decision tree would be as follows:

We choose to start splitting the data by the value of the Outlook feature. We can see that if the value is Overcast, then the decision to train outside is always Yes and does not depend on the values of the other features. The Sunny and Rainy branches are not conclusive on their own and need to be split further to reach an answer.
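The first split described above can be sketched in a few lines of Python. The rows below are hypothetical toy data invented for this illustration (they are not the dataset from the table), and the helper names `split_by` and `is_pure` are assumptions of this sketch. It groups the target labels by Outlook and reports which branches already give a single answer and which need further splitting.

```python
from collections import defaultdict

# Hypothetical toy rows, NOT the book's table: (outlook, windy, train_outside)
rows = [
    ("Sunny", False, "No"),
    ("Sunny", True, "Yes"),
    ("Overcast", False, "Yes"),
    ("Overcast", True, "Yes"),
    ("Rainy", False, "Yes"),
    ("Rainy", True, "No"),
]


def split_by(rows, feature_index):
    """Group the target labels (last column) by the value of one feature."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[feature_index]].append(row[-1])
    return dict(groups)


def is_pure(labels):
    """A branch is pure when every row in it has the same target label."""
    return len(set(labels)) == 1


groups = split_by(rows, 0)  # column 0 is Outlook
for value, labels in groups.items():
    status = "leaf" if is_pure(labels) else "split further"
    print(f"{value}: {labels} -> {status}")
```

In this toy data, only the Overcast branch is pure, which mirrors the situation in the text: that branch becomes a leaf, while the other two need another feature to separate Yes from No.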

How can we decide which feature to split on first, and how to continue from there? We will use the entropy of the target variable, measuring how much it changes when the data is split by each of the input features.
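As a minimal sketch of this idea, the following Python code computes the Shannon entropy of the target labels and the reduction in entropy obtained by splitting on each feature (commonly called information gain). The rows are hypothetical toy data made up for this illustration, not the dataset from the table, and the function names `entropy` and `information_gain` are assumptions of this sketch rather than part of any library.

```python
from collections import Counter
from math import log2

# Hypothetical toy rows, NOT the book's table.
rows = [
    {"outlook": "Sunny", "windy": False, "train_outside": "No"},
    {"outlook": "Sunny", "windy": True, "train_outside": "Yes"},
    {"outlook": "Overcast", "windy": False, "train_outside": "Yes"},
    {"outlook": "Overcast", "windy": True, "train_outside": "Yes"},
    {"outlook": "Rainy", "windy": False, "train_outside": "Yes"},
    {"outlook": "Rainy", "windy": True, "train_outside": "No"},
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, feature, target):
    """Entropy of the target minus the weighted entropy after splitting on feature."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[feature] for r in rows}:
        subset = [r[target] for r in rows if r[feature] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


gains = {f: information_gain(rows, f, "train_outside") for f in ("outlook", "windy")}
best = max(gains, key=gains.get)
print(gains, "-> split first on", best)
```

On this toy data, splitting on `outlook` reduces the entropy while splitting on `windy` leaves it unchanged, so `outlook` would be chosen first. The same comparison is then repeated inside each branch that is not yet pure.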