ReLU

The Rectified Linear Unit (ReLU) has become quite popular in recent years. Its mathematical formula is as follows:

f(x) = max(0, x)

Compared to sigmoid and tanh, its computation is much simpler and more efficient. It has been shown to improve convergence considerably (for example, Krizhevsky and his co-authors reported a factor-of-six speedup in convergence in ImageNet Classification with Deep Convolutional Neural Networks, 2012), possibly because it has a linear, non-saturating form. Also, unlike the tanh and sigmoid functions, which involve expensive exponential operations, ReLU can be computed by simply thresholding the activation at zero. As a result, it has become very popular over the last few years, and almost all deep learning models use ReLU nowadays. Another important advantage of ReLU is that it avoids or rectifies the vanishing gradient problem.
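
The following is a minimal NumPy sketch (not taken from this book) of what "thresholding the activation at zero" means in practice: the forward pass is a single max with zero, and the gradient is 1 for positive inputs, so it does not shrink toward zero the way the sigmoid gradient does for large inputs.

```python
import numpy as np

def relu(x):
    # f(x) = max(0, x), applied element-wise; no exponentials needed
    return np.maximum(0.0, x)

def relu_grad(x):
    # Derivative is 1 where x > 0 and 0 elsewhere, so it does not saturate
    return (x > 0).astype(x.dtype)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])
print(relu(x))        # [0.   0.   0.   0.5  3. ]
print(relu_grad(x))   # [0.   0.   0.   1.   1. ]
# For comparison, sigmoid's gradient shrinks toward 0 for large |x| (saturation):
print(sigmoid(x) * (1 - sigmoid(x)))
```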

Its limitation is that its output does not lie in the probability space, so it can be used only in the hidden layers, not in the output layer. Therefore, for classification problems, one needs to apply the softmax function to the last layer to compute the class probabilities, while for a regression problem one simply uses a linear output. Another problem with ReLU is that it can produce dead neurons: if a large gradient flows through a ReLU unit, the weight update may leave that neuron in a state where it never activates again on any future data point.
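
As a hedged illustration of that division of labor (the layer sizes and weights here are hypothetical, not from the book), the sketch below keeps ReLU in the hidden layer and uses softmax only on the final layer so the outputs form valid class probabilities; a regression network would instead leave the last layer linear.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def softmax(z):
    # Subtract the row-wise max for numerical stability; rows then sum to 1
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 4))                      # one example with 4 features
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)    # hidden layer (hypothetical sizes)
W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)    # output layer for 3 classes

h = relu(x @ W1 + b1)            # ReLU only in the hidden layer
probs = softmax(h @ W2 + b2)     # softmax turns raw scores into probabilities
print(probs, probs.sum())        # the probabilities sum to 1.0
```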

To fix this dying-neuron problem, a modification called Leaky ReLU was introduced. Instead of outputting zero for negative inputs, it uses a small slope there, which keeps the gradient nonzero and the weight updates alive.
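
A minimal sketch of Leaky ReLU follows (the slope alpha = 0.01 is an assumed, commonly used default, not a value given in the text): negative inputs are scaled by alpha instead of being zeroed out, so the gradient on the negative side is alpha rather than 0.

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # x for x > 0, alpha * x otherwise
    return np.where(x > 0, x, alpha * x)

def leaky_relu_grad(x, alpha=0.01):
    # Gradient is 1 for positive inputs and alpha (not 0) for negative inputs,
    # so a unit stuck in the negative region can still recover
    return np.where(x > 0, 1.0, alpha)

x = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])
print(leaky_relu(x))       # [-0.05 -0.01  0.    1.    5.  ]
print(leaky_relu_grad(x))  # [0.01  0.01  0.01  1.    1.  ]
```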
