- Deep Learning Quick Reference
- Mike Bernico
Using momentum with gradient descent
Gradient descent with momentum speeds up learning by accelerating updates in directions where the gradient has been consistent, while damping updates in directions where the gradient fluctuates. In effect, it allows gradient descent to build velocity.
Momentum works by introducing a velocity term and using a weighted moving average of that term in the update rule, as follows:
$$v_t = \beta v_{t-1} + (1 - \beta)\,\nabla_\theta J(\theta)$$
$$\theta = \theta - \alpha v_t$$
Most typically, β is set to 0.9 in the case of momentum, and usually this is not a hyperparameter that needs to be changed.
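As a minimal NumPy sketch of the update rule above, the following implements momentum for a toy one-dimensional objective; the quadratic function, learning rate, and step count are illustrative assumptions, not values from the text:

```python
import numpy as np

def gradient_descent_with_momentum(grad_fn, theta, alpha=0.1, beta=0.9, steps=200):
    """Minimize an objective using the weighted-moving-average momentum update."""
    v = np.zeros_like(theta)  # velocity term, initialized to zero
    for _ in range(steps):
        g = grad_fn(theta)
        v = beta * v + (1 - beta) * g  # weighted moving average of gradients
        theta = theta - alpha * v      # step parameters along the velocity
    return theta

# Illustrative example: minimize f(theta) = theta^2, whose gradient is 2*theta
theta0 = np.array([5.0])
theta_min = gradient_descent_with_momentum(lambda t: 2 * t, theta0)
print(theta_min)  # approaches 0
```

In Keras, momentum is typically enabled by passing `momentum` to the `SGD` optimizer, for example `SGD(learning_rate=0.01, momentum=0.9)`. Note that Keras implements the `v = momentum * v - lr * g` variant rather than the weighted-average form above; the two produce the same trajectory up to a rescaling of the learning rate.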