Regularization

When a model is ill-conditioned or prone to overfitting, regularization offers effective tools to mitigate these problems. From a mathematical viewpoint, a regularizer is a penalty added to the cost function so as to impose an additional constraint on the evolution of the parameters:

$$L_R(X, Y; \theta) = L(X, Y; \theta) + \lambda \, g(\theta)$$

The parameter λ controls the strength of the regularization, which is expressed through the function g(θ). A fundamental condition on g(θ) is that it must be differentiable, so that the new composite cost function can still be optimized using SGD algorithms. In general, any sufficiently smooth function can be employed; however, we normally need a function that counteracts the unbounded growth of the parameters.
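As a minimal sketch of this idea, the following NumPy code adds a penalty λ·g(θ) to a cost function and optimizes the composite cost with mini-batch SGD. Everything specific here is an assumption made for illustration and does not come from the text: the cost is a mean squared error on a linear model, g(θ) is half the squared L2 norm, and the function names, learning rate, batch size, and synthetic data are hypothetical.

```python
import numpy as np

def cost(theta, X, y):
    """Unregularized cost: half the mean squared error (assumed for illustration)."""
    residuals = X @ theta - y
    return 0.5 * np.mean(residuals ** 2)

def penalty(theta):
    """Assumed regularizer g(theta): half the squared L2 norm (differentiable)."""
    return 0.5 * np.sum(theta ** 2)

def regularized_cost(theta, X, y, lam):
    """Composite cost: original cost plus lam * g(theta)."""
    return cost(theta, X, y) + lam * penalty(theta)

def regularized_grad(theta, X, y, lam):
    """Gradient of the composite cost; the penalty contributes lam * theta."""
    grad_cost = X.T @ (X @ theta - y) / len(y)
    return grad_cost + lam * theta

def sgd(X, y, lam, lr=0.05, epochs=200, batch_size=16, seed=0):
    """Plain mini-batch SGD on the composite cost."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(X.shape[1])
    for _ in range(epochs):
        order = rng.permutation(len(y))
        for start in range(0, len(y), batch_size):
            idx = order[start:start + batch_size]
            theta -= lr * regularized_grad(theta, X[idx], y[idx], lam)
    return theta

# Synthetic data (hypothetical): a noisy linear relationship.
rng = np.random.default_rng(1)
X = rng.normal(size=(64, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=64)

for lam in (0.0, 1.0):
    theta = sgd(X, y, lam)
    print(f"lambda={lam}: ||theta|| = {np.linalg.norm(theta):.3f}")
```

With a larger λ, the penalty gradient pulls every parameter toward zero at each step, so the learned parameter vector ends up with a smaller norm than in the unregularized run.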

To understand the principle, let's consider the following diagram:

Interpolation with a linear curve (left) and a parabolic one (right)

In the first diagram, the model is linear and has two parameters, while in the second one, it is quadratic and has three parameters. We already know that the second option is more prone to overfitting, but if we apply a regularization term, it's possible to limit the growth of a (the coefficient of the quadratic term), transforming the model into a linearized version. Of course, there's a difference between choosing a lower-capacity model and applying a regularization constraint. In the first case, we give up the possibilities offered by the extra capacity and run the risk of increasing the bias, while with regularization we keep the same model but optimize it so as to reduce the variance. Let's now explore the most common regularization techniques.
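The linearization effect can be sketched numerically. The example below is an illustration under stated assumptions, not the procedure used in the text: it fits y ≈ a·x² + b·x + c by penalized least squares in closed form, and applies the penalty λ·a² to the quadratic coefficient only, so that the effect on a is isolated. The function name `fit_quadratic` and the synthetic data are hypothetical.

```python
import numpy as np

def fit_quadratic(x, y, lam):
    """Fit y ~ a*x^2 + b*x + c, minimizing squared error + lam * a^2.

    Only the quadratic coefficient a is penalized (an assumption made here
    to isolate the linearization effect). Returns the array [a, b, c].
    """
    Phi = np.column_stack([x ** 2, x, np.ones_like(x)])
    D = np.diag([1.0, 0.0, 0.0])  # selects a as the only penalized parameter
    return np.linalg.solve(Phi.T @ Phi + lam * D, Phi.T @ y)

# Hypothetical data: a few noisy samples of a straight line.
rng = np.random.default_rng(0)
x = np.linspace(-2.0, 2.0, 8)
y = 1.5 * x + 0.5 + rng.normal(scale=0.5, size=x.size)

for lam in (0.0, 1.0, 100.0):
    a, b, c = fit_quadratic(x, y, lam)
    print(f"lambda={lam:6.1f}  a={a:+.4f}  b={b:+.4f}  c={c:+.4f}")
```

As λ grows, a is driven toward zero and the remaining parameters approach the ordinary least-squares line, while the model itself still has three parameters. A small λ leaves the quadratic capacity available if the data actually require it.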
