Effective Amazon Machine Learning, by Alexis Perrier
Regularization on linear models
The Stochastic Gradient Descent (SGD) algorithm finds the optimal weights $\{w_j\}$ of the model by minimizing the error between the true and the predicted values on the N training samples:

$$\min_{\{w_j\}} \; \frac{1}{N} \sum_{i=1}^{N} \left( \hat{y}_i - y_i \right)^2$$

Where $\hat{y}_i = \sum_{j=1}^{n} w_j x_{ij}$ are the predicted values and $y_i$ the real values to be predicted; we have N samples, and each sample has n dimensions.
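To make the objective concrete, here is a minimal NumPy sketch of plain (unregularized) SGD on the squared error. The function name sgd_linear_regression, the learning rate, and the synthetic data are illustrative assumptions, not code from the book.

```python
import numpy as np

def sgd_linear_regression(X, y, lr=0.01, epochs=50, seed=0):
    """Fit y ~ X @ w by stochastic gradient descent on the squared error."""
    rng = np.random.default_rng(seed)
    N, n = X.shape
    w = np.zeros(n)
    for _ in range(epochs):
        for i in rng.permutation(N):      # visit samples in random order
            error = X[i] @ w - y[i]       # (y_hat_i - y_i)
            w -= lr * 2 * error * X[i]    # gradient of the squared error
    return w

# Illustrative data: 200 samples, 3 dimensions, known true weights
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=200)
print(sgd_linear_regression(X, y))        # should approach true_w
```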
Regularization consists of adding a term to the previous equation and minimizing the regularized error:

$$\min_{\{w_j\}} \; \frac{1}{N} \sum_{i=1}^{N} \left( \hat{y}_i - y_i \right)^2 + \alpha R(w)$$

The parameter $\alpha$ quantifies the amount of regularization, while $R(w)$ is the regularization term, which depends on the regression coefficients.
There are two types of weight constraints usually considered (both are sketched in code after the list):
- L2 regularization as the sum of the squares of the coefficients:

  $$R(w) = \sum_{j=1}^{n} w_j^2$$

- L1 regularization as the sum of the absolute values of the coefficients:

  $$R(w) = \sum_{j=1}^{n} |w_j|$$
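In gradient terms, the two penalties differ only in the extra term added to each weight update: L2 contributes $2\alpha w_j$, while L1 contributes $\alpha \, \mathrm{sign}(w_j)$. Below is a minimal sketch of a single regularized SGD step; the function name regularized_step and its default values are illustrative assumptions, not from the book.

```python
import numpy as np

def regularized_step(w, x_i, y_i, lr=0.01, alpha=0.001, penalty="l2"):
    """One SGD step on sample (x_i, y_i) with an L1 or L2 penalty."""
    grad = 2 * (x_i @ w - y_i) * x_i       # gradient of the squared error
    if penalty == "l2":
        grad += alpha * 2 * w              # d/dw of alpha * sum(w_j^2)
    elif penalty == "l1":
        grad += alpha * np.sign(w)         # subgradient of alpha * sum(|w_j|)
    return w - lr * grad
```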
The constraint that the regularization term R(w) places on the coefficients prevents the model from overfitting the training data. Because regularization ties the coefficients together, they can no longer adapt too closely to individual predictors. Each type of regularization has its own characteristics and gives rise to a different variation on the SGD algorithm, which we introduce next.
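For a concrete point of reference, these variants map onto the penalty parameter of scikit-learn's SGDRegressor. This usage sketch assumes scikit-learn is available; it is an outside illustration, not part of the book's Amazon ML workflow.

```python
from sklearn.linear_model import SGDRegressor

# penalty="l2" gives Ridge-style SGD, penalty="l1" gives Lasso-style SGD,
# and penalty="elasticnet" mixes both; alpha sets the amount of regularization.
ridge_sgd = SGDRegressor(penalty="l2", alpha=0.001, max_iter=1000)
lasso_sgd = SGDRegressor(penalty="l1", alpha=0.001, max_iter=1000)

# Illustrative fit on the synthetic data from the earlier sketch:
# ridge_sgd.fit(X, y); lasso_sgd.fit(X, y)
```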