Using momentum with gradient descent

Adding momentum speeds up gradient descent: it accelerates learning along directions in which the gradient keeps a consistent direction and damps learning along directions in which the gradient fluctuates. In effect, the optimizer builds up velocity wherever successive gradients agree.

Momentum works by introducing a velocity term, and using a weighted moving average of that term in the update rule, as follows:
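
One common way to write such an update is sketched here. The symbols are assumptions chosen for illustration: $\theta$ for the parameters, $J$ for the loss, $\alpha$ for the learning rate, $\beta$ for the momentum coefficient, and $v_t$ for the velocity at step $t$ (with $v_0 = 0$). Other texts and libraries use slightly different but related forms, for example omitting the $(1 - \beta)$ factor.

$$
\begin{aligned}
v_t &= \beta \, v_{t-1} + (1 - \beta) \, \nabla_\theta J(\theta_{t-1}) \\
\theta_t &= \theta_{t-1} - \alpha \, v_t
\end{aligned}
$$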

Most typically, the momentum coefficient (commonly written β) is set to 0.9, and it is usually not a hyperparameter that needs to be changed.
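
To make the idea concrete, the following is a minimal pure-Python sketch of gradient descent with momentum in the weighted-moving-average form, using a coefficient of 0.9. The function names (`momentum_step`, `grad_quadratic`, `minimize`), the toy loss, the starting point, and the learning rate are all hypothetical choices for illustration, not taken from any particular library.

```python
def momentum_step(theta, velocity, grad, lr=0.1, beta=0.9):
    """Apply one momentum update to a list of parameters.

    The velocity is a weighted moving average of the gradients
    (an assumed formulation; some libraries omit the 1 - beta factor).
    """
    velocity = [beta * v + (1.0 - beta) * g for v, g in zip(velocity, grad)]
    theta = [t - lr * v for t, v in zip(theta, velocity)]
    return theta, velocity


def grad_quadratic(theta):
    """Gradient of the toy loss f(x, y) = 0.5 * (x**2 + 10 * y**2)."""
    x, y = theta
    return [x, 10.0 * y]


def minimize(steps=300, lr=0.1, beta=0.9):
    """Run momentum gradient descent on the toy loss from a fixed start."""
    theta = [5.0, 1.0]       # arbitrary starting point
    velocity = [0.0, 0.0]    # velocity starts at zero
    for _ in range(steps):
        grad = grad_quadratic(theta)
        theta, velocity = momentum_step(theta, velocity, grad, lr, beta)
    return theta


if __name__ == "__main__":
    print(minimize())  # both coordinates should end up close to 0
```

The toy loss is much steeper in `y` than in `x`, which is the kind of situation where averaging gradients over time helps: the velocity accumulates along the shallow direction while oscillations along the steep direction partly cancel out.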
