Defining your features

The second step in machine learning is defining your features. Think of features as the components or attributes of the problem you wish to solve. In machine learning, and specifically when creating a new model, features are among the biggest influences on your model's performance. Thinking your problem statement through carefully will yield an initial set of features that differentiate the records in your dataset and, in turn, shape your model's results. Going back to the Mayor example in the preceding section, which features would you collect as data points about each citizen? Perhaps start by looking at the Mayor's competition and where the Mayor stands on the issues in ways that differ from the other candidates. These positions could be turned into features and then made into a poll for the citizens of John Doe County to answer. Using these data points would give you a solid first pass at features. As with model building, expect to run several iterations of feature engineering and model training, especially as your dataset grows. After model evaluation, feature importance is used to determine which features are actually driving your predictions. Occasionally, you will find that features chosen on gut instinct turn out to be inconsequential after a few iterations of model training and feature engineering.
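To make the idea of feature importance concrete, here is a minimal sketch using permutation importance, which is one common way to measure it and not necessarily the method used later in this book. Everything in it is an assumption made for illustration: the poll questions (`taxes`, `transit`, `likes_blue`), the synthetic answers, the rule that a citizen's vote follows their tax answer, and the small nearest-centroid classifier standing in for a real model.

```python
import random

FEATURES = ["taxes", "transit", "likes_blue"]  # hypothetical poll questions


def make_poll(n, seed):
    """Generate made-up poll answers (1 = agrees with the Mayor, 0 = disagrees)."""
    rng = random.Random(seed)
    rows = [[rng.randint(0, 1) for _ in FEATURES] for _ in range(n)]
    # Assumption for this demo only: the vote follows the tax answer.
    labels = [row[0] for row in rows]
    return rows, labels


def fit_centroids(rows, labels):
    """Train a nearest-centroid classifier: one mean vector per class."""
    centroids = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        centroids[cls] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def predict(centroids, row):
    """Return the class whose centroid is closest to the row."""
    def dist(center):
        return sum((x - m) ** 2 for x, m in zip(row, center))
    return min(centroids, key=lambda cls: dist(centroids[cls]))


def accuracy(centroids, rows, labels):
    hits = sum(predict(centroids, r) == y for r, y in zip(rows, labels))
    return hits / len(rows)


def permutation_importance(centroids, rows, labels, seed=0):
    """Shuffle one feature column at a time and record the accuracy drop."""
    rng = random.Random(seed)
    baseline = accuracy(centroids, rows, labels)
    scores = {}
    for j, name in enumerate(FEATURES):
        column = [r[j] for r in rows]
        rng.shuffle(column)
        shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
        scores[name] = baseline - accuracy(centroids, shuffled, labels)
    return scores


train_rows, train_labels = make_poll(200, seed=1)
test_rows, test_labels = make_poll(200, seed=2)
model = fit_centroids(train_rows, train_labels)
importances = permutation_importance(model, test_rows, test_labels)

for name, drop in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name}: accuracy drop {drop:.2f}")
```

In this toy setup, shuffling the `taxes` answers should hurt accuracy noticeably, while shuffling `likes_blue` should barely matter. That is the pattern the paragraph above describes: a feature that seemed worth collecting can turn out to contribute almost nothing once the model is evaluated.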

In Chapter 11, Training and Building Production Models, we will take a deep dive into best practices for defining features and into common approaches for getting a solid first pass at feature engineering on complex problems.