
Defining your features

The second step in machine learning is defining your features. Think of features as components or attributes of the problem you wish to solve. In machine learning, and specifically when creating a new model, features are among the biggest influences on your model's performance. Thinking your problem statement through carefully will suggest an initial set of features that distinguish the examples in your dataset and, in turn, shape your model's results. Going back to the Mayor example in the preceding section, which data points about a citizen would you consider as features? Perhaps start by looking at the Mayor's competition and where the Mayor's positions on the issues differ from those of the other candidates. These positions could be turned into features and then made into a poll for citizens of John Doe County to answer. Using these data points would create a solid first pass at features.

As with model building, feature engineering usually takes several iterations, alternating with model training, especially as your dataset grows. After model evaluation, feature importance is used to determine which features are actually driving your predictions. Occasionally, you will find that features chosen on gut instinct turn out to be inconsequential after a few iterations of model training and feature engineering.
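To make the idea of feature importance concrete, here is a minimal sketch of permutation importance in plain Python. Everything in it is an assumption made for illustration: the issue names, the synthetic poll data, and the hard-coded `predict` rule that stands in for a trained model. It is not the method used later in this book. The idea is to shuffle one feature column at a time and measure how much accuracy falls; a feature whose shuffling changes nothing is not driving the predictions.

```python
import random

# Hypothetical poll: each citizen answers 1 (agrees with the Mayor) or 0
# (disagrees) on three made-up issues. These names are illustrative only.
ISSUES = ["transit", "taxes", "parks"]


def make_poll(n, seed=0):
    """Generate synthetic poll rows. By construction, the vote depends
    only on transit and taxes; parks is a gut-instinct feature with no effect."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        row = [rng.randint(0, 1) for _ in ISSUES]
        rows.append(row)
        labels.append(1 if row[0] == 1 and row[1] == 1 else 0)
    return rows, labels


def predict(row):
    """Stand-in for a trained model (an assumption, not a real learner)."""
    return 1 if row[0] == 1 and row[1] == 1 else 0


def accuracy(rows, labels):
    """Fraction of rows where the stand-in model matches the label."""
    correct = sum(predict(r) == y for r, y in zip(rows, labels))
    return correct / len(rows)


def permutation_importance(rows, labels, seed=1):
    """Return the accuracy drop caused by shuffling each feature column."""
    rng = random.Random(seed)
    baseline = accuracy(rows, labels)
    scores = {}
    for j, name in enumerate(ISSUES):
        column = [r[j] for r in rows]
        rng.shuffle(column)  # break the link between this feature and the label
        shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
        scores[name] = baseline - accuracy(shuffled, labels)
    return scores


rows, labels = make_poll(500)
importance = permutation_importance(rows, labels)
for name, drop in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(f"{name}: accuracy drop {drop:.3f}")
```

In this synthetic setup, shuffling `parks` leaves accuracy unchanged because the stand-in model never reads it, which mirrors the case of a gut-instinct feature that turns out to be inconsequential. With a real model and real poll data the scores would be noisier, and you would normally compute them on held-out data.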

In Chapter 11, Training and Building Production Models, we will take a deep dive into best practices for defining features and into common approaches to complex problems, so that you can obtain a solid first pass at feature engineering.
