
A brief history of contemporary deep learning

In addition to the aforementioned models, the first edition of this book included networks such as Restricted Boltzmann Machines (RBMs) and Deep Belief Networks (DBNs). They were popularized by Geoffrey Hinton, a Canadian scientist and one of the most prominent deep learning researchers. Back in 1986, he was also one of the inventors of backpropagation. RBMs are a special type of generative neural network, where the neurons are organized into two layers, namely visible and hidden. Unlike feed-forward networks, the data in an RBM can flow in both directions: from visible to hidden units, and vice versa. In 2002, Prof. Hinton introduced contrastive divergence, an unsupervised algorithm for training RBMs. And in 2006, he introduced DBNs, which are deep neural networks formed by stacking multiple RBMs. Thanks to their novel training algorithm, it became possible to create a DBN with more hidden layers than had previously been feasible. To understand why, we should first explain why it was so difficult to train deep neural networks up to that point. At the time, the activation function of choice was the logistic sigmoid, shown in the following chart:

Logistic sigmoid (blue) and its derivative (green)
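
For reference, both curves can be recreated with a few lines of NumPy and matplotlib. This is a minimal sketch; the plotting details are illustrative and not taken from the original figure:

import numpy as np
import matplotlib.pyplot as plt

def sigmoid(x):
    # Logistic sigmoid: squashes any real value into the interval (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    # The derivative can be expressed through the sigmoid itself: s * (1 - s)
    s = sigmoid(x)
    return s * (1.0 - s)

x = np.linspace(-8, 8, 400)
plt.plot(x, sigmoid(x), label='sigmoid')
plt.plot(x, sigmoid_derivative(x), label='derivative')
plt.legend()
plt.show()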

We now know that, to train a neural network, we need to compute the derivative of the activation function (along with all the other derivatives). The sigmoid derivative has a significant value only in a narrow interval centered around 0, and it converges towards 0 everywhere else. In networks with many layers, it's highly likely that the gradient will have converged to 0 by the time it is propagated back to the first layers of the network. Effectively, this means we cannot update the weights in those layers. This is the famous vanishing gradients problem, which (along with a few other issues) prevented the training of deep networks. By stacking pre-trained RBMs, DBNs were able to alleviate (but not solve) this problem.
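
To see why, note that the sigmoid derivative never exceeds 0.25, and backpropagation multiplies one such factor per layer, so the product shrinks geometrically with depth. The following toy sketch (an illustrative assumption: a chain of single-unit sigmoid layers with all weights fixed to 1) makes this concrete:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Forward pass through a chain of single-unit sigmoid layers (weights fixed to 1
# for simplicity), multiplying the local derivatives as backpropagation would
x = 0.5
gradient = 1.0
for layer in range(1, 21):
    s = sigmoid(x)
    gradient *= s * (1.0 - s)  # sigmoid derivative, never larger than 0.25
    x = s
    if layer % 5 == 0:
        print(f'after {layer:2d} layers, gradient factor = {gradient:.3e}')

The printed factor quickly approaches zero, which is exactly the vanishing gradients effect described above.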

But training a DBN is not easy. Let's look at the following steps:

  • First, we have to train each RBM with contrastive divergence and gradually stack them on top of each other. This phase is called pre-training (a minimal sketch of this step follows after the list).
  • In effect, pre-training serves as a sophisticated weight initialization algorithm for the next phase, called fine-tuning. With fine-tuning, we transform the DBN into a regular multi-layer perceptron and continue training it with supervised backpropagation, in the same way we saw in Chapter 2, Neural Networks.
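
To make the pre-training phase more concrete, here is a minimal NumPy sketch of a single binary RBM trained with one-step contrastive divergence (CD-1). The layer sizes, learning rate, and toy data are illustrative assumptions, not values from the original papers:

import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# A tiny binary RBM: 6 visible units and 3 hidden units
n_visible, n_hidden = 6, 3
W = rng.normal(scale=0.1, size=(n_visible, n_hidden))  # connection weights
b_v = np.zeros(n_visible)                              # visible biases
b_h = np.zeros(n_hidden)                               # hidden biases
learning_rate = 0.1

def cd1_update(v0):
    # One contrastive divergence (CD-1) update for a single binary training vector
    global W, b_v, b_h
    # Positive phase: visible -> hidden
    p_h0 = sigmoid(v0 @ W + b_h)
    h0 = (rng.random(n_hidden) < p_h0).astype(float)  # sample hidden states
    # Negative phase: hidden -> visible -> hidden (one Gibbs step)
    p_v1 = sigmoid(h0 @ W.T + b_v)
    v1 = (rng.random(n_visible) < p_v1).astype(float)
    p_h1 = sigmoid(v1 @ W + b_h)
    # Move towards the data-driven statistics and away from the model-driven ones
    W += learning_rate * (np.outer(v0, p_h0) - np.outer(v1, p_h1))
    b_v += learning_rate * (v0 - v1)
    b_h += learning_rate * (p_h0 - p_h1)

# Toy training loop over a single repeating binary pattern
v = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])
for _ in range(100):
    cd1_update(v)

In a DBN, the hidden activations produced by this trained RBM would become the visible input of the next RBM in the stack, and fine-tuning would then adjust all layers jointly with supervised backpropagation.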

However, thanks to some algorithmic advances, it's now possible to train deep networks using plain old backpropagation, effectively eliminating the pre-training phase. We will discuss these improvements in the coming sections, but for now, let's just say that they rendered DBNs and RBMs obsolete. DBNs and RBMs are, without a doubt, interesting from a research perspective, but they are rarely used in practice anymore. Because of this, we will omit them from this edition.
