
How to do it

L1/L2 regularization is implemented in Keras, as follows:

from keras.models import Sequential
from keras.layers import Dense
from keras.regularizers import l2
model = Sequential()
model.add(Dense(1000, input_dim=784, activation='relu', kernel_regularizer=l2(0.1)))
model.add(Dense(10, activation='softmax', kernel_regularizer=l2(0.1)))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=500, batch_size=1024, verbose=1)

Note that the preceding involves an additional argument, kernel_regularizer, through which we specify whether to apply L1 or L2 regularization. Furthermore, we also specify the lambda value, which is the weight given to the regularization term.
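
For instance, to switch to L1 regularization (or to combine both penalties), only the regularizer passed to kernel_regularizer changes. The following is a minimal sketch; the 0.01 lambda values are illustrative rather than tuned:

from keras.models import Sequential
from keras.layers import Dense
from keras.regularizers import l1, l1_l2

# Same architecture as before, but penalizing the absolute values of the weights (L1)
# in the hidden layer and a mix of L1 and L2 penalties in the output layer
model_l1 = Sequential()
model_l1.add(Dense(1000, input_dim=784, activation='relu', kernel_regularizer=l1(0.01)))
model_l1.add(Dense(10, activation='softmax', kernel_regularizer=l1_l2(l1=0.01, l2=0.01)))
model_l1.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])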

We notice that, post regularization, the training dataset accuracy no longer reaches ~100%, while the test dataset accuracy is at 98%. The histogram of weights post-L2 regularization is visualized in the next graph.

The weights connecting the input layer to the hidden layer are extracted as follows:

model.get_weights()[0].flatten()
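
As a quick sanity check (a sketch assuming numpy is available as np), you can print the shape of every array returned by get_weights() to confirm which layer each matrix belongs to:

import numpy as np

# get_weights() returns [kernel_0, bias_0, kernel_1, bias_1, ...] in layer order;
# the (784, 1000) kernel belongs to the hidden layer and the (1000, 10) kernel to the output layer
for i, w in enumerate(model.get_weights()):
    print(i, w.shape, np.abs(w).mean())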

Once the weights are extracted, they are plotted as follows:

import matplotlib.pyplot as plt
plt.hist(model.get_weights()[0].flatten())

We notice that the majority of weights are now much closer to zero when compared to the previous scenario, which helps avoid overfitting. We would see a similar trend in the case of L1 regularization.

Notice that the weight values when regularization is applied are much lower than the weight values when no regularization is performed.

Thus, L1 and L2 regularization help us avoid overfitting on the training dataset.
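
To put a number on that difference, a minimal sketch follows; model_no_reg is a hypothetical model trained earlier with the same architecture but without kernel_regularizer, so that name is an assumption rather than an object defined in this recipe:

import numpy as np

# model_no_reg is a hypothetical, previously trained model without kernel_regularizer
w_reg = model.get_weights()[0].flatten()
w_no_reg = model_no_reg.get_weights()[0].flatten()

# The mean absolute weight shrinks noticeably when L2 regularization is applied
print('mean |w| with L2 regularization:', np.abs(w_reg).mean())
print('mean |w| without regularization:', np.abs(w_no_reg).mean())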
