
  • Deep Learning with PyTorch
  • Vishnu Subramanian

ReLU

ReLU has become more popular in recent years; it, or one of its variants, can be found in almost every modern architecture. It has a simple mathematical formulation:

f(x)=max(0,x)

In simple terms, ReLU squashes any negative input to zero and leaves positive numbers as they are. We can visualize the ReLU function as follows:

Image source: http://datareview.info/article/eto-nuzhno-znat-klyuchevyie-rekomendatsii-po-glubokomu-obucheniyu-chast-2/
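The definition can be checked directly in PyTorch; the following is a minimal sketch (the tensor values are only illustrative):

import torch
import torch.nn.functional as F

x = torch.tensor([-2.0, -0.5, 0.0, 1.5, 3.0])

# ReLU: negative inputs become zero, positive inputs pass through unchanged
print(F.relu(x))              # tensor([0.0000, 0.0000, 0.0000, 1.5000, 3.0000])

# Equivalent to the definition f(x) = max(0, x)
print(torch.clamp(x, min=0))  # same result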

Some of the pros and cons of using ReLU are as follows:

  • It helps the optimizer find the right set of weights sooner; more technically, it makes stochastic gradient descent converge faster.
  • It is computationally inexpensive, as we are just thresholding and not calculating anything, as we did for the sigmoid and tanh functions.
  • ReLU has one disadvantage: when a large gradient passes through it during backpropagation, the affected neurons often become non-responsive; these are called dead neurons, which can be controlled by carefully choosing the learning rate (see the sketch after this list). We will discuss how to choose learning rates when we discuss the different ways to adjust the learning rate in Chapter 4, Fundamentals of Machine Learning.
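
In practice, ReLU is usually added as a layer in a network, and variants such as leaky ReLU keep a small slope for negative inputs, which is one way of reducing the dead neuron problem. A minimal sketch (the layer sizes and inputs are arbitrary):

import torch
import torch.nn as nn

# A small fully connected block using ReLU as the activation
model = nn.Sequential(
    nn.Linear(10, 5),
    nn.ReLU(),
    nn.Linear(5, 1),
)
print(model(torch.randn(4, 10)).shape)   # torch.Size([4, 1])

# Leaky ReLU, a common variant, keeps a small negative slope instead of zero,
# so neurons do not become completely unresponsive
leaky = nn.LeakyReLU(negative_slope=0.01)
print(leaky(torch.tensor([-2.0, 3.0])))  # tensor([-0.0200,  3.0000])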