Generalization/true error

This is the second and more important type of error in data science. The whole purpose of building a learning system is to achieve a low generalization error; in other words, to get the model to work well on observations/samples that were not used in the training phase. In practice, this error is estimated on a held-out test set. If you go back to the classroom scenario from the previous section, you can think of generalization as the ability to solve exam problems that are not necessarily similar to the problems you solved in class while learning the subject. So, generalization performance is the model's ability to use the skills (parameters) that it learned in the training phase to correctly predict the output of unseen data.
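As a minimal sketch of this idea, the following Python snippet fits a line on a training split and then measures the error on samples the model never saw. The synthetic data, the 75/25 split, and the names `training_error` and `test_error` are assumptions made for illustration only; the test error here is just an estimate of the true generalization error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): y = 2x + 1 plus noise.
x = rng.uniform(-1.0, 1.0, size=200)
y = 2.0 * x + 1.0 + rng.normal(scale=0.3, size=200)

# Hold out 25% of the samples; the model never sees them during training.
indices = rng.permutation(len(x))
test_idx, train_idx = indices[:50], indices[50:]

# Training phase: learn the parameters (slope, intercept) from training data only.
slope, intercept = np.polyfit(x[train_idx], y[train_idx], deg=1)


def mean_squared_error(idx):
    """Mean squared error of the fitted line on the samples selected by idx."""
    predictions = slope * x[idx] + intercept
    return float(np.mean((predictions - y[idx]) ** 2))


training_error = mean_squared_error(train_idx)
test_error = mean_squared_error(test_idx)  # estimate of the generalization error

print(f"training error: {training_error:.4f}")
print(f"test error:     {test_error:.4f}")
```

The key design choice is that the test indices are set aside before fitting, so the reported test error reflects performance on unseen data only.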

In Figure 13, the light blue line represents the generalization error. You can see that as you increase the model complexity, the generalization error decreases up to a point, after which the model starts to lose its generalization power and the generalization error increases. This part of the curve, where the model loses generalization power as complexity grows, is called overfitting.
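The shape of this curve can be explored with a small experiment. The sketch below is not the code behind Figure 13; it assumes a synthetic sine-shaped target, uses polynomial degree as a stand-in for model complexity, and compares the training and test errors for each degree. Typically the training error keeps shrinking as the degree grows, while the test error stops improving and may rise again once the model starts to overfit; the exact numbers depend on the random seed and noise level.

```python
import numpy as np

rng = np.random.default_rng(42)


def true_function(x):
    """Hypothetical underlying relationship, chosen only for illustration."""
    return np.sin(np.pi * x)


# A small, noisy training set and a larger test set from the same distribution.
x_train = rng.uniform(-1.0, 1.0, size=20)
y_train = true_function(x_train) + rng.normal(scale=0.3, size=20)
x_test = rng.uniform(-1.0, 1.0, size=500)
y_test = true_function(x_test) + rng.normal(scale=0.3, size=500)

degrees = list(range(1, 11))  # polynomial degree stands in for model complexity
train_errors, test_errors = [], []

for degree in degrees:
    # Fit on the training data only.
    coefficients = np.polyfit(x_train, y_train, deg=degree)
    train_pred = np.polyval(coefficients, x_train)
    test_pred = np.polyval(coefficients, x_test)
    train_errors.append(float(np.mean((train_pred - y_train) ** 2)))
    test_errors.append(float(np.mean((test_pred - y_test) ** 2)))

# The degree with the lowest test error is the best trade-off in this run.
best_degree = degrees[int(np.argmin(test_errors))]

for degree, train_err, test_err in zip(degrees, train_errors, test_errors):
    print(f"degree={degree:2d}  train={train_err:.4f}  test={test_err:.4f}")
print(f"lowest test error at degree {best_degree}")
```

Degrees above `best_degree` correspond to the overfitting region described above: the model fits the training samples more closely without predicting unseen samples any better.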

The takeaway message from this section is to minimize the generalization error as much as you can.
