
  • Deep Learning Essentials
  • Wei Di, Anurag Bhardwaj, Jianing Wei

Choosing the right activation function

In most cases, ReLU should be your first choice, but keep in mind that it should only be applied to hidden layers. If your model suffers from dead neurons, consider lowering the learning rate, or try Leaky ReLU or maxout.
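As a minimal illustration (a NumPy sketch, not the book's code), these variants can be written directly from their definitions; the `alpha` slope in `leaky_relu` and the two-piece form of `maxout` are illustrative choices:

```python
import numpy as np

def relu(x):
    # Standard ReLU: zero for negative inputs, identity otherwise.
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # Leaky ReLU keeps a small slope (alpha) for negative inputs,
    # so "dead" units still receive a nonzero gradient.
    return np.where(x > 0, x, alpha * x)

def maxout(x, w1, b1, w2, b2):
    # A two-piece maxout unit: the maximum of two learned affine functions.
    return np.maximum(x @ w1 + b1, x @ w2 + b2)
```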

Using either sigmoid or tanh is not recommended, as both suffer from the vanishing gradient problem and converge slowly. Take sigmoid as an example: its derivative is at most 0.25 everywhere, so every factor it contributes during backpropagation shrinks the gradient further. ReLU, by contrast, has a derivative of exactly one for every positive input, which keeps gradients from shrinking and yields a more stable network.
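The effect is easy to see numerically. The sketch below (an illustrative example, not from the book) multiplies the per-layer derivative factors across a 10-layer chain: even in sigmoid's best case the product collapses toward zero, while ReLU's factor of one leaves the gradient magnitude unchanged:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    s = sigmoid(x)
    return s * (1.0 - s)  # peaks at 0.25 when x = 0

layers = 10
# Best case for sigmoid: every pre-activation is 0, so each factor is 0.25.
sigmoid_chain = sigmoid_derivative(0.0) ** layers  # 0.25**10 ≈ 9.5e-7
# ReLU on positive inputs: each factor is exactly 1.
relu_chain = 1.0 ** layers                          # 1.0

print(sigmoid_chain, relu_chain)
```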

Now that you have gained a basic understanding of the key components of neural networks, let's move on to how networks learn from data.
