
Error measures

In general, when working with a supervised scenario, we define a non-negative error measure e_m which takes two arguments (the expected output y_i and the predicted output ŷ_i) and allows us to compute a total error value over the whole dataset (made up of n samples):

$E = \sum_{i=1}^{n} e_m(y_i, \hat{y}_i)$
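As a minimal sketch of this definition (assuming NumPy arrays; the names total_error and e_m are illustrative, not taken from this chapter), the per-sample measure can be kept pluggable:

import numpy as np

def total_error(e_m, y_true, y_pred):
    # Sum a non-negative per-sample error measure over the whole dataset
    return sum(e_m(t, p) for t, p in zip(y_true, y_pred))

# Example: absolute difference as the per-sample measure
y_true = np.array([1.0, 0.5, -0.2])
y_pred = np.array([0.9, 0.7, -0.1])
print(total_error(lambda t, p: abs(t - p), y_true, y_pred))  # ~0.4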
This total error is also implicitly dependent on the specific hypothesis H through its parameter set; therefore, optimizing the error implies finding an optimal hypothesis (considering the hardness of many optimization problems, this is not the absolute best one, but an acceptable approximation). In many cases, it's useful to consider the mean square error (MSE):

$E_{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$
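For instance, a minimal NumPy sketch of the MSE (the function name mse is illustrative):

import numpy as np

def mse(y_true, y_pred):
    # Mean of the squared differences between expected and predicted outputs
    return np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)

print(mse([1.0, 0.5, -0.2], [0.9, 0.7, -0.1]))  # ~0.02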
The initial value of this measure represents a starting point on the surface of an n-variable function. A generic training algorithm has to find the global minimum, or a point quite close to it (there's always a tolerance, to avoid an excessive number of iterations and the consequent risk of overfitting). This measure is also called a loss function, because its value must be minimized through an optimization problem. When it's easier to determine a quantity that must be maximized, the corresponding loss function can be defined as its reciprocal.
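As a toy illustration of such a training loop, here is a plain gradient descent on a one-variable loss (the learning rate, tolerance, and iteration cap are arbitrary choices for the sketch, not values prescribed by this chapter):

def gradient_descent(grad, x0, learning_rate=0.1, tolerance=1e-6, max_iter=1000):
    # Follow the negative gradient until the step is smaller than the tolerance
    x = x0
    for _ in range(max_iter):
        step = learning_rate * grad(x)
        x -= step
        if abs(step) < tolerance:
            break
    return x

# Minimize L(x) = (x - 3)^2, whose gradient is 2(x - 3)
print(gradient_descent(lambda x: 2.0 * (x - 3.0), x0=0.0))  # ~3.0

The tolerance check plays the role mentioned above: the loop stops when further iterations would no longer change the result appreciably.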

Another useful loss function is called the zero-one loss, and it's particularly efficient for binary classification (also for the one-vs-rest multiclass strategy):

$L_{0/1}(y_i, \hat{y}_i) = \begin{cases} 0 & \text{if } \hat{y}_i = y_i \\ 1 & \text{if } \hat{y}_i \neq y_i \end{cases}$
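A minimal sketch of this measure (assuming NumPy label arrays; averaging it yields the misclassification rate):

import numpy as np

def zero_one_loss(y_true, y_pred):
    # 1 for each misclassified sample, 0 for each correct one
    return (np.asarray(y_true) != np.asarray(y_pred)).astype(int)

y_true = np.array([0, 1, 1, 0])
y_pred = np.array([0, 1, 0, 0])
print(zero_one_loss(y_true, y_pred).mean())  # 0.25 (one sample out of four is wrong)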
The zero-one loss is implicitly an indicator function, and it can easily be adopted in loss functions based on the probability of misclassification.

A helpful interpretation of a generic (and continuous) loss function can be expressed in terms of potential energy, where the loss, as a function of the parameter set, plays the role of the potential:

$U(\bar{\theta}) = L(\bar{\theta})$
The predictor is like a ball on a rough surface: starting from a random point where the energy (that is, the error) is usually rather high, it must move until it reaches a stable equilibrium point, where its energy (relative to the global minimum) is null. In the following figure, there's a schematic representation of some different situations:

[Figure: a ball on a rough error surface, showing a starting point, the global minimum, a ridge, and a flat right-hand valley (a local minimum)]
Just as in the physical situation, the starting point is stable without any external perturbation, so to start the process we need to provide some initial kinetic energy. However, if this energy is strong enough, then after descending the slope the ball cannot stop at the global minimum: the residual kinetic energy can be enough to overcome the ridge and reach the right-hand valley. If there are no other energy sources, the ball gets trapped in this flat valley and cannot move anymore. There are many techniques that have been engineered to solve this problem and avoid local minima (a minimal sketch of one of them, momentum, follows below). However, every situation must always be carefully analyzed to understand what level of residual energy (or error) is acceptable, or whether it's better to adopt a different strategy. We're going to discuss some of them in the next chapters.
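Momentum adds a velocity term to gradient descent, mimicking the ball's inertia; here is a minimal sketch (the coefficient values are illustrative, not taken from this chapter):

def momentum_descent(grad, x0, learning_rate=0.05, momentum=0.9, max_iter=500):
    # The velocity term acts like residual kinetic energy: it can carry
    # the point over small ridges that would trap plain gradient descent
    x, v = x0, 0.0
    for _ in range(max_iter):
        v = momentum * v - learning_rate * grad(x)
        x += v
    return x

# Minimize L(x) = (x - 3)^2, whose gradient is 2(x - 3)
print(momentum_descent(lambda x: 2.0 * (x - 3.0), x0=0.0))  # ~3.0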
