Analyzing the effect of activation functions on the feedforward network's accuracy
In the previous example, we used ReLU as the activation function. TensorFlow supports multiple activation functions, so let's look at how each of them affects validation accuracy. First, we generate a range of evenly spaced input values:
x_val = np.linspace(start=-10., stop=10., num=1000)
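The snippets in this section assume that NumPy, Matplotlib, and a TensorFlow 1.x session are already available; a minimal setup sketch (using the same names as the code below) looks like this:
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf  # TensorFlow 1.x graph/session API

# Session used by the session.run(...) calls in this section
session = tf.Session()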
Then generate the activation output:
# ReLU activation
y_relu = session.run(tf.nn.relu(x_val))
# ReLU-6 activation
y_relu6 = session.run(tf.nn.relu6(x_val))
# Sigmoid activation
y_sigmoid = session.run(tf.nn.sigmoid(x_val))
# Hyperbolic tangent activation
y_tanh = session.run(tf.nn.tanh(x_val))
# Softsign activation
y_softsign = session.run(tf.nn.softsign(x_val))
# Softplus activation
y_softplus = session.run(tf.nn.softplus(x_val))
# Exponential linear activation
y_elu = session.run(tf.nn.elu(x_val))
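For reference, each of these activations has a simple closed form. The following NumPy sketch (added here for illustration, not part of the original listing) mirrors the TensorFlow ops above and can be checked against their outputs:
# Closed-form definitions of the activations computed above
numpy_activations = {
    'ReLU':     lambda x: np.maximum(x, 0.0),
    'ReLU6':    lambda x: np.minimum(np.maximum(x, 0.0), 6.0),
    'Sigmoid':  lambda x: 1.0 / (1.0 + np.exp(-x)),
    'Tanh':     np.tanh,
    'Softsign': lambda x: x / (1.0 + np.abs(x)),
    'Softplus': lambda x: np.log1p(np.exp(x)),
    'ELU':      lambda x: np.where(x > 0.0, x, np.exp(x) - 1.0),
}

# Sanity check against the TensorFlow outputs computed above
assert np.allclose(numpy_activations['ReLU6'](x_val), y_relu6)
assert np.allclose(numpy_activations['Softplus'](x_val), y_softplus)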
Plot the activation against x_val:
plt.plot(x_val, y_softplus, 'r--', label='Softplus', linewidth=2)
plt.plot(x_val, y_relu, 'b:', label='RELU', linewidth=2)
plt.plot(x_val, y_relu6, 'g-.', label='RELU6', linewidth=2)
plt.plot(x_val, y_elu, 'k-', label='ELU', linewidth=1)
plt.ylim([-1.5,7])
plt.legend(loc='upper left')
plt.title('Activation functions', y=1.05)
plt.show()
plt.plot(x_val, y_sigmoid, 'r--', label='Sigmoid', linewidth=2)
plt.plot(x_val, y_tanh, 'b:', label='tanh', linewidth=2)
plt.plot(x_val, y_softsign, 'g-.', label='Softsign', linewidth=2)
plt.ylim([-1.5,1.5])
plt.legend(loc='upper left')
plt.title('Activation functions with Vanishing Gradient', y=1.05)
plt.show()
The resulting plots are shown in the following figure:

The plot comparing Activation functions with Vanishing Gradient is as follows:

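The vanishing gradient label refers to the fact that sigmoid, tanh, and softsign saturate: their slopes shrink towards zero as the input grows in magnitude. A quick numerical check (a sketch added here, reusing the session created above) illustrates this:
x_pts = tf.constant([0.0, 2.0, 5.0, 10.0])
d_sigmoid = tf.gradients(tf.nn.sigmoid(x_pts), x_pts)[0]
d_tanh = tf.gradients(tf.nn.tanh(x_pts), x_pts)[0]
# Both gradients decay rapidly; sigmoid'(10) is roughly 4.5e-05
print(session.run([d_sigmoid, d_tanh]))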
Now let's look at how the choice of activation function affects validation accuracy on the notMNIST data.
We have modified the previous example so that the activation function can be passed as a parameter from main():
RELU = 'RELU'
RELU6 = 'RELU6'
CRELU = 'CRELU'
SIGMOID = 'SIGMOID'
ELU = 'ELU'
SOFTPLUS = 'SOFTPLUS'

def activation(name, features):
    if name == RELU:
        return tf.nn.relu(features)
    if name == RELU6:
        return tf.nn.relu6(features)
    if name == SIGMOID:
        return tf.nn.sigmoid(features)
    if name == CRELU:
        return tf.nn.crelu(features)
    if name == ELU:
        return tf.nn.elu(features)
    if name == SOFTPLUS:
        return tf.nn.softplus(features)
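In the run() function that follows, the variable definitions, training computation, and optimizer are only indicated by comments. Based on the prediction expressions shown in that listing (a single hidden layer of no_of_neurons units wired through activation()), the elided part would look roughly like the sketch below; the truncated-normal initialization and the 0.5 learning rate are assumptions of this sketch, not values from the book:
# Hypothetical sketch of the section elided inside run()
w1 = tf.Variable(tf.truncated_normal([image_size * image_size, no_of_neurons]))
b1 = tf.Variable(tf.zeros([no_of_neurons]))
w2 = tf.Variable(tf.truncated_normal([no_of_neurons, num_of_labels]))
b2 = tf.Variable(tf.zeros([num_of_labels]))

hidden = activation(name, tf.matmul(tf_train_dataset, w1) + b1)
logits = tf.matmul(hidden, w2) + b2
loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=tf_train_labels, logits=logits))
optimizer = tf.train.GradientDescentOptimizer(0.5).minimize(loss)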
The run() function definition encompasses the logic that we defined earlier:
batch_size = 128
# CRELU is left out of the list below: tf.nn.crelu concatenates relu(x) and
# relu(-x), doubling the feature dimension, so it would need different weight shapes.
#activations = [RELU, RELU6, SIGMOID, CRELU, ELU, SOFTPLUS]
activations = [RELU, RELU6, SIGMOID, ELU, SOFTPLUS]
plot_loss = False

def run(name):
    print(name)
    with open(pickle_file, 'rb') as f:
        save = pickle.load(f)
        training_dataset = save['train_dataset']
        training_labels = save['train_labels']
        validation_dataset = save['valid_dataset']
        validation_labels = save['valid_labels']
        test_dataset = save['test_dataset']
        test_labels = save['test_labels']
    train_dataset, train_labels = reformat(training_dataset, training_labels)
    valid_dataset, valid_labels = reformat(validation_dataset,
                                           validation_labels)
    test_dataset, test_labels = reformat(test_dataset, test_labels)

    graph = tf.Graph()
    no_of_neurons = 1024
    with graph.as_default():
        tf_train_dataset = tf.placeholder(tf.float32,
                                          shape=(batch_size, image_size * image_size))
        tf_train_labels = tf.placeholder(tf.float32,
                                         shape=(batch_size, num_of_labels))
        tf_valid_dataset = tf.constant(valid_dataset)
        tf_test_dataset = tf.constant(test_dataset)
        # Define Variables.
        # Training computation...
        # Optimizer ..
        # Predictions for the training, validation, and test data.
        train_prediction = tf.nn.softmax(logits)
        valid_prediction = tf.nn.softmax(
            tf.matmul(activation(name, tf.matmul(tf_valid_dataset, w1) + b1), w2) + b2)
        test_prediction = tf.nn.softmax(
            tf.matmul(activation(name, tf.matmul(tf_test_dataset, w1) + b1), w2) + b2)

    num_steps = 101
    minibatch_acc = []
    validation_acc = []
    loss_array = []
    with tf.Session(graph=graph) as session:
        tf.global_variables_initializer().run()
        print("Initialized")
        for step in range(num_steps):
            offset = (step * batch_size) % (train_labels.shape[0] - batch_size)
            # Generate a minibatch.
            batch_data = train_dataset[offset:(offset + batch_size), :]
            batch_labels = train_labels[offset:(offset + batch_size), :]
            feed_dict = {tf_train_dataset: batch_data, tf_train_labels: batch_labels}
            _, l, predictions = session.run(
                [optimizer, loss, train_prediction], feed_dict=feed_dict)
            minibatch_accuracy = accuracy(predictions, batch_labels)
            validation_accuracy = accuracy(valid_prediction.eval(), valid_labels)
            if (step % 10 == 0):
                print("Minibatch loss at step", step, ":", l)
                print("Minibatch accuracy: %.1f%%" % accuracy(predictions, batch_labels))
                print("Validation accuracy: %.1f%%" % accuracy(
                    valid_prediction.eval(), valid_labels))
            minibatch_acc.append(minibatch_accuracy)
            validation_acc.append(validation_accuracy)
            loss_array.append(l)
        print("Test accuracy: %.1f%%" % accuracy(test_prediction.eval(), test_labels))
        return validation_acc, loss_array
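The driver code that loops over the activations list is not reproduced here; a minimal sketch (the plot styling is our own choice) that produces the comparison described next could look like this:
# Hypothetical driver: run the experiment once per activation and plot the results
results = {name: run(name) for name in activations}

for name, (validation_acc, loss_array) in results.items():
    plt.plot(validation_acc, label=name)
plt.xlabel('Step')
plt.ylabel('Validation accuracy (%)')
plt.legend(loc='lower right')
plt.title('Validation accuracy by activation function')
plt.show()

if plot_loss:
    for name, (_, loss_array) in results.items():
        plt.plot(loss_array, label=name)
    plt.xlabel('Step')
    plt.ylabel('Minibatch loss')
    plt.legend(loc='upper right')
    plt.title('Training loss by activation function')
    plt.show()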
The validation accuracy plots for the activations in the preceding list are shown in the following figure:

As can be seen in the preceding graphs, ReLU and ReLU6 give the highest validation accuracy, which is close to 60 percent. Now let's look at how the training loss behaves as we progress through the steps for the various activations:

Training loss converges toward zero for most of the activation functions, although ReLU's loss decreases the most slowly in the short term.