Putting it all together with an example

As we already mentioned, multi-layer neural networks can classify linearly non-separable classes. In fact, the Universal Approximation Theorem states that any continuous function on a compact subset of R^n can be approximated by a neural network with at least one hidden layer. The formal proof of the theorem is too complex to explain here, but we'll attempt to give an intuitive explanation using some basic mathematics. We'll implement a neural network that approximates the boxcar function, shown on the right in the following diagram, which is a simple type of step function. Since a series of step functions can approximate any continuous function on a compact subset of R, this will give us an idea of why the Universal Approximation Theorem holds (a small numerical sketch of this idea follows the figure caption below):

The diagram on the left depicts continuous function approximation with a series of step functions, while the diagram on the right illustrates a single boxcar step function
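
To make the left-hand diagram more concrete, here is a small numerical sketch (it is not part of the book's code; the boxcar helper, the sin(x) target, the steepness value, and the number of bins are all illustrative choices). Each boxcar is built as the difference of two steep logistic sigmoids, which is exactly the construction the rest of this section implements as a small neural network:

import numpy

def boxcar(x, start, end, height, steepness=100.0):
    # difference of two steep sigmoids: roughly `height` on [start, end], 0 elsewhere
    rising = 1.0 / (1.0 + numpy.exp(-steepness * (x - start)))
    falling = 1.0 / (1.0 + numpy.exp(-steepness * (x - end)))
    return height * (rising - falling)

x = numpy.arange(0.0, 2 * numpy.pi, 0.01)
target = numpy.sin(x)

# partition [0, 2*pi] into 20 bins and stack one boxcar per bin,
# using the target's value at the bin midpoint as the boxcar's height
edges = numpy.linspace(0.0, 2 * numpy.pi, 21)
approximation = sum(
    boxcar(x, lo, hi, numpy.sin((lo + hi) / 2.0))
    for lo, hi in zip(edges[:-1], edges[1:])
)

# the maximum error shrinks as the number of bins grows
print(numpy.max(numpy.abs(approximation - target)))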

To do this, we'll use the logistic sigmoid activation function. As we know, the logistic sigmoid is defined as σ(a) = 1 / (1 + e^(-a)), where a is the weighted sum of the inputs plus the bias, a = Σi wi xi + b:

  1. Let's assume that we have only one input neuron, x = x1.
  2. In the following diagrams, we can see that by making w very large, the sigmoid becomes close to a step function, while b simply translates the function along the x axis; the translation t is equal to -b/w (a quick numerical check of this follows the caption below):
On the left, we have a standard sigmoid with a weight of 1 and a bias of 0; in the middle, we have a sigmoid with a weight of 10; and on the right, we have a sigmoid with a weight of 10 and a bias of 50
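
As a quick numerical check (this snippet is not part of the book's example; the sample points and the w = 100, b = 500 pair are illustrative), we can evaluate the sigmoid around the transition point t = -b/w = -5 and watch it sharpen as w grows:

import numpy

def sigmoid(x, w, b):
    return 1.0 / (1.0 + numpy.exp(-(w * x + b)))

x = numpy.array([-5.5, -5.1, -5.0, -4.9, -4.5])

# w = 10, b = 50 -> transition at t = -b / w = -5, still fairly gradual
print(sigmoid(x, w=10, b=50))    # ~[0.007, 0.269, 0.5, 0.731, 0.993]

# larger w (with b scaled so that t stays at -5) gives an almost perfect step
print(sigmoid(x, w=100, b=500))  # ~[0.0, 0.00005, 0.5, 0.99995, 1.0]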

With this in mind, let's define the architecture of our network. It will have a single input neuron, one hidden layer with two neurons, and a single output neuron:

Both hidden neurons use the logistic sigmoid activation. The weights and biases of the network are organized in such a way as to take advantage of the sigmoid properties we described previously. The top neuron initiates the first transition, t1 (from 0 to 1), and then, once the input passes the second threshold, the second neuron initiates the opposite transition, t2 (from 1 to 0). The following code implements this example:

# The user can modify the values of the weight w
# as well as bias_value_1 and bias_value_2 to observe
# how this plots to different step functions

import matplotlib.pyplot as plt
import numpy

weight_value = 1000

# modify to change where the step function starts
bias_value_1 = 5000

# modify to change where the step function ends
bias_value_2 = -5000

# set the axis limits of the plot
plt.axis([-10, 10, -1, 10])

print("The step function starts at {0} and ends at {1}"
      .format(-bias_value_1 / weight_value,
              -bias_value_2 / weight_value))

inputs = numpy.arange(-10, 10, 0.01)
outputs = list()

# iterate over a range of inputs
for x in inputs:
    y1 = 1.0 / (1.0 + numpy.exp(-weight_value * x - bias_value_1))
    y2 = 1.0 / (1.0 + numpy.exp(-weight_value * x - bias_value_2))

    # modify to change the height of the step function
    w = 7

    # network output
    y = y1 * w - y2 * w

    outputs.append(y)

plt.plot(inputs, outputs, lw=2, color='black')
plt.show()

We set large values for weight_value, bias_value_1, and bias_value_2. In this way, the expressions numpy.exp(-weight_value * x - bias_value_1) and numpy.exp(-weight_value * x - bias_value_2) switch between 0 and infinity over a very short interval of the input. In turn, y1 and y2 switch between 1 and 0. This gives the logistic sigmoid a step-like (as opposed to gradual) shape, as explained previously. Because the numpy.exp expressions overflow to infinity, the code will produce an "overflow encountered in exp" warning, but this is normal.
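
If the warning is distracting, one possible variation (this is an assumption on our part, not part of the book's listing) is to compute the two sigmoids on the whole input array inside numpy.errstate, which silences the expected overflow while producing the same curve:

import numpy

# same values as in the book's example
weight_value, bias_value_1, bias_value_2, w = 1000, 5000, -5000, 7
inputs = numpy.arange(-10, 10, 0.01)

# ignore the expected overflow to infinity inside exp
with numpy.errstate(over='ignore'):
    y1 = 1.0 / (1.0 + numpy.exp(-weight_value * inputs - bias_value_1))
    y2 = 1.0 / (1.0 + numpy.exp(-weight_value * inputs - bias_value_2))

outputs = w * (y1 - y2)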

When executed, this code plots the resulting boxcar function, which rises from 0 to 7 at x = -5 and falls back to 0 at x = 5.
