
The hyperbolic tangent activation function

The output, y, of the hyperbolic tangent activation function (tanh) as a function of its total input, x, is given as follows:

$$ y = \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} $$
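As a quick illustration (a minimal NumPy sketch, not part of the book's code; the function name tanh here is our own), the formula can be evaluated directly and checked against NumPy's built-in np.tanh:

```python
import numpy as np

def tanh(x):
    # Hyperbolic tangent computed from its defining formula
    return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))

x = np.linspace(-4, 4, 9)
print(tanh(x))                            # all values lie strictly between -1 and 1
print(np.allclose(tanh(x), np.tanh(x)))   # True: matches NumPy's built-in tanh
```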

The tanh activation function outputs values in the range [-1, 1], as you can see in the following graph:

Figure 1.7: Tanh activation function

One thing to note is that both the sigmoid and tanh activation functions are approximately linear within a small range of the input, beyond which the output saturates. In the saturation zone, the gradients of the activation functions (with respect to the input) are very small or close to zero, which makes these functions prone to the vanishing gradient problem. As you will see later on, neural networks learn through the backpropagation method, where the gradient of a layer depends on the gradients of the activation units in the succeeding layers, up to the final output layer. Therefore, if the activation units are operating in the saturation region, much less of the error is backpropagated to the early layers of the neural network. Neural networks learn their weights and biases (W) by using these gradients to minimize the prediction error; if the gradients are very small or vanish to zero, the network will fail to learn these parameters properly.
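To see the saturation effect concretely, the following short sketch (illustrative only, not from the book; tanh_grad is a name of our choosing) prints the gradient of tanh, which is 1 - tanh(x)^2. The gradient equals 1 at the origin and collapses toward zero as |x| grows, which is exactly the vanishing behavior described above:

```python
import numpy as np

def tanh_grad(x):
    # Derivative of tanh: d/dx tanh(x) = 1 - tanh(x)^2
    return 1.0 - np.tanh(x) ** 2

# In the near-linear zone (x close to 0) the gradient is close to 1;
# in the saturation zone (|x| large) it shrinks toward zero, so almost
# no error signal flows back through saturated units.
for x in [0.0, 1.0, 2.5, 5.0, 10.0]:
    print(f"x = {x:5.1f}  ->  d(tanh)/dx = {tanh_grad(x):.8f}")
```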
