
Summary

In this chapter, we discussed fundamental concepts shared by almost all machine learning models. In the first part, we introduced the data generating process as a generalization of a finite dataset. We explained the most common strategies for splitting a finite dataset into a training block and a validation set, and we introduced cross-validation, together with some of its most important variants, as one of the best approaches for avoiding the limitations of a static split.
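As a quick recap of the splitting strategies mentioned above, the following is a minimal sketch using scikit-learn; the dataset and classifier are hypothetical placeholders chosen only for illustration:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split, cross_val_score

# Hypothetical dataset (200 samples, 10 features)
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Static split: a single training block and a single validation set
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0)

# k-fold cross-validation: every sample is used for validation exactly once,
# mitigating the dependence on one particular static split
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=10)
print(scores.mean())
```

Averaging the fold scores gives a less biased estimate of the generalization performance than a single static split.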

In the second part, we discussed the main properties of an estimator: capacity, bias, and variance. We also introduced the Vapnik-Chervonenkis theory, which is a mathematical formalization of the concept of representational capacity, and we analyzed the effects of high bias and high variance. In particular, we discussed two effects, underfitting and overfitting, and their relationship with high bias and high variance, respectively.
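The effect of capacity on underfitting and overfitting can be sketched by varying the degree of a polynomial regression; the dataset and degree values below are illustrative assumptions, not taken from the chapter:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical noisy non-linear dataset
rng = np.random.default_rng(1)
X = rng.uniform(-3.0, 3.0, size=(80, 1))
y = np.sin(X).ravel() + 0.2 * rng.normal(size=80)

# Degree 1 underfits (high bias), a moderate degree fits well,
# a very high degree tends to overfit (high variance)
scores = {}
for degree in (1, 4, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    scores[degree] = cross_val_score(model, X, y, cv=5).mean()
    print(degree, scores[degree])
```

On data like this, the linear model cannot represent the sinusoid (underfitting), while the moderate-capacity model achieves a clearly higher cross-validated score.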

In the third part, we introduced loss and cost functions, first as proxies for the expected risk, and then we detailed some common situations that can arise during an optimization problem. We also presented some common cost functions, together with their main features. In the last part, we discussed regularization, explaining how it can mitigate the effects of overfitting.
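The shrinkage effect of regularization can be seen by comparing an unpenalized linear regression with a ridge (L2-penalized) one; the dataset and the penalty strength `alpha=10.0` are arbitrary choices for illustration:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

# Hypothetical dataset: only the first feature carries signal
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 20))
y = X[:, 0] + 0.1 * rng.normal(size=50)

lr = LinearRegression().fit(X, y)
ridge = Ridge(alpha=10.0).fit(X, y)

# The L2 penalty shrinks the coefficient vector, trading a little bias
# for a reduction in variance, which mitigates overfitting
print(np.linalg.norm(lr.coef_), np.linalg.norm(ridge.coef_))
```

The norm of the ridge coefficients is smaller than that of the unpenalized ones, which is exactly the mechanism by which the penalty constrains the model's effective capacity.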

In the next chapter, Chapter 2, Introduction to Semi-Supervised Learning, we're going to introduce semi-supervised learning, focusing our attention on the concepts of transductive and inductive learning.
