
Summary

In this chapter, we explored the fundamental issues and concerns surrounding data quality, saw how to categorize quality issues by type, and presented ideas for tidying up your data.

To compare the performance of the different models that one may create, we then established some fundamental notions of model performance, such as the mean squared error (MSE) for regression and the classification error rate for classification.
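As a reminder of what these two measures compute, here is a minimal sketch in plain Python. The function names `mean_squared_error` and `classification_error_rate` are chosen for this illustration only and are not taken from the chapter or from any particular library.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def classification_error_rate(labels_true, labels_pred):
    """Fraction of observations whose predicted class differs from the true class."""
    if len(labels_true) != len(labels_pred) or len(labels_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mistakes = sum(t != p for t, p in zip(labels_true, labels_pred))
    return mistakes / len(labels_true)


if __name__ == "__main__":
    # Regression: one prediction is off by 2, so the MSE is 4 / 3.
    print(mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
    # Classification: one of four labels is wrong, so the error rate is 0.25.
    print(classification_error_rate(["a", "b", "a", "b"], ["a", "a", "a", "b"]))
```

For both measures, lower values indicate better performance, which is what makes them suitable for comparing models trained on the same data.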

We also introduced cross-validation as a general assessment technique for cases where only a limited amount of data is available.
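The idea can be sketched as k-fold cross-validation in plain Python. This is an illustrative sketch under stated assumptions, not the chapter's own implementation: the helper names (`k_fold_indices`, `cross_validate`), the trivial mean-predicting model, and the choice of MSE as the score are all hypothetical.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k roughly equal folds."""
    if not 1 < k <= n:
        raise ValueError("k must satisfy 1 < k <= n")
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(xs, ys, fit, predict, score, k=5, seed=0):
    """Hold each fold out once, train on the rest, and average the k scores."""
    scores = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        predictions = [predict(model, xs[i]) for i in held_out]
        scores.append(score([ys[i] for i in held_out], predictions))
    return sum(scores) / len(scores)


# A deliberately trivial model for illustration: always predict the training mean.
def fit_mean(xs, ys):
    return sum(ys) / len(ys)


def predict_mean(model, x):
    return model


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    xs = list(range(10))
    ys = [2.0 * x + 1.0 for x in xs]
    print(cross_validate(xs, ys, fit_mean, predict_mean, mse, k=5))
```

Because every observation is used for testing exactly once and for training in the remaining rounds, no data has to be permanently set aside, which is why the approach suits small data sets.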

Finally, we discussed learning curves as a way to judge a model's ability to learn and improve its scores.

With a firm grounding in the basics of the predictive modeling process, we will look at linear regression in the next chapter.
