
LASSO

LASSO applies the L1-norm rather than the L2-norm used in ridge regression. The L1 penalty is the sum of the absolute values of the feature weights, so LASSO minimizes RSS + λ(sum |Bj|). Unlike the ridge penalty, this shrinkage penalty can force a feature weight to exactly zero. This is a clear advantage over ridge regression, as it effectively performs feature selection and may greatly improve model interpretability.

The mathematics behind why the L1-norm allows the weights/coefficients to become zero is beyond the scope of this book (refer to Tibshirani, 1996 for further details).
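Without going through the derivation, the effect can be seen in the simplest possible case. The following is a minimal Python sketch, not code from this book, that assumes a model with a single centered feature and no intercept. In that case, minimizing RSS + λ|B| has a closed-form solution known as soft-thresholding, while minimizing RSS + λB² only rescales the least-squares estimate. The function names (`soft_threshold`, `lasso_1d`, `ridge_1d`) and the example numbers are hypothetical and chosen purely for illustration.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; return exactly 0.0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_1d(xy, xx, lam):
    """Minimizer of RSS + lam * |b| for one feature (xy = x.y, xx = x.x)."""
    return soft_threshold(xy, lam / 2.0) / xx


def ridge_1d(xy, xx, lam):
    """Minimizer of RSS + lam * b**2 for one feature (xy = x.y, xx = x.x)."""
    return xy / (xx + lam)


# Illustrative values only: with xx = 1 the least-squares estimate equals xy.
lam = 1.0
for ols in (2.0, 0.4, -0.3):
    print(ols, lasso_1d(ols, 1.0, lam), ridge_1d(ols, 1.0, lam))
```

In this sketch, any least-squares estimate whose magnitude is at most λ/2 is set to exactly zero by the L1 penalty, whereas the L2 penalty shrinks every estimate by the same factor and never reaches zero unless the estimate was already zero.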

If LASSO is so great, then ridge regression must be obsolete, right? Not so fast! When features are highly collinear or have high pairwise correlations, LASSO may force the coefficient of a predictive feature to zero, and you lose that feature's predictive ability. For example, if both feature A and feature B should be in your model, LASSO may shrink one of their coefficients to zero. The following quote sums up this issue nicely:
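To make the feature A and feature B scenario concrete, here is a small, contrived Python sketch. It is not taken from this book: the data, the correlation of 0.95, the true weights of 1.2 and 0.8, and the deliberately large penalty λ = 3.4 are all assumptions chosen so that the behavior is easy to see. It fits both penalties with a plain cyclic coordinate descent loop, which is only one of several ways to solve these problems. The features and response are centered, so no intercept is fitted.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def soft_threshold(z, t):
    """Shrink z toward zero by t; return exactly 0.0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def coordinate_descent(columns, y, lam, penalty, sweeps=200):
    """Minimize RSS + lam * penalty term by cyclic coordinate descent.

    columns: list of feature columns; penalty: "lasso" (L1) or "ridge" (L2).
    """
    b = [0.0] * len(columns)
    for _ in range(sweeps):
        for j, xj in enumerate(columns):
            # Partial residual: y minus the fit from all other features.
            r = [yi - sum(b[k] * columns[k][i]
                          for k in range(len(b)) if k != j)
                 for i, yi in enumerate(y)]
            xr, xx = dot(xj, r), dot(xj, xj)
            if penalty == "lasso":
                b[j] = soft_threshold(xr, lam / 2.0) / xx
            else:
                b[j] = xr / (xx + lam)
    return b


# Two centered, unit-length features with correlation rho (toy data).
rho = 0.95
u = [0.5, 0.5, -0.5, -0.5]
v = [0.5, -0.5, 0.5, -0.5]
feature_a = u
feature_b = [rho * ui + math.sqrt(1 - rho ** 2) * vi for ui, vi in zip(u, v)]

# Both features genuinely contribute to the response.
y = [1.2 * a + 0.8 * b for a, b in zip(feature_a, feature_b)]

lam = 3.4  # deliberately heavy penalty, for illustration only
lasso_coefs = coordinate_descent([feature_a, feature_b], y, lam, "lasso")
ridge_coefs = coordinate_descent([feature_a, feature_b], y, lam, "ridge")
print("lasso:", lasso_coefs)
print("ridge:", ridge_coefs)
```

With these made-up numbers, the L1 fit keeps feature A and sets the coefficient of feature B to exactly zero, even though B was used to generate the response, while the L2 fit keeps both coefficients nonzero and of similar size. A smaller λ would let both features survive in the L1 fit, so the example should be read as a demonstration of the mechanism and not as a statement about typical penalty values.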

"One might expect the lasso to perform better in a setting where a relatively small number of predictors have substantial coefficients, and the remaining predictors have coefficients that are very small or that equal zero. Ridge regression will perform better when the response is a function of many predictors, all with coefficients of roughly equal size."
                                                                                                                     -(James, 2013)

It is possible to achieve the best of both worlds, which leads us to the next topic, elastic net.
