
How to do it

L1/L2 regularization is implemented in Keras, as follows:

from keras.models import Sequential
from keras.layers import Dense
from keras.regularizers import l2
model = Sequential()
model.add(Dense(1000, input_dim=784, activation='relu', kernel_regularizer=l2(0.1)))
model.add(Dense(10, activation='softmax', kernel_regularizer=l2(0.1)))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
history = model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=500, batch_size=1024, verbose=1)

Note that the preceding involves an additional argument, kernel_regularizer, through which we specify whether L1 or L2 regularization is applied. Furthermore, we also specify the lambda value that gives the weight to the regularization term.
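If we want an L1 penalty instead (or both penalties together), the same kernel_regularizer argument accepts the l1 and l1_l2 helpers from keras.regularizers. The following is a minimal sketch of the same architecture with L1 regularization; the model_l1 name and the lambda values shown are only illustrative:

from keras.models import Sequential
from keras.layers import Dense
from keras.regularizers import l1, l1_l2

# Same architecture as above, but penalizing the absolute value of the weights (L1)
model_l1 = Sequential()
model_l1.add(Dense(1000, input_dim=784, activation='relu', kernel_regularizer=l1(0.1)))
model_l1.add(Dense(10, activation='softmax', kernel_regularizer=l1(0.1)))

# l1_l2 applies both penalties at once, each with its own lambda value, for example:
# kernel_regularizer=l1_l2(l1=0.01, l2=0.01)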

We notice that, post regularization, the training dataset accuracy no longer reaches ~100%, while the test dataset accuracy is at 98%. The histogram of weights post-L2 regularization is visualized in the next graph.

The weights connecting the input layer to the hidden layer are extracted as follows:

# Extract and flatten the kernel of the first Dense layer
model.get_weights()[0].flatten()

Once the weights are extracted, they are plotted as follows:

import matplotlib.pyplot as plt
plt.hist(model.get_weights()[0].flatten())

We notice that the majority of weights are now much closer to zero than in the previous scenario, which reduces the likelihood of overfitting. We would see a similar trend in the case of L1 regularization.

Notice that the weight values are much lower when regularization is applied than when it is not.
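One quick way to check this numerically is to compare the average absolute weight of the two models. The sketch below assumes the unregularized model from the previous scenario is still available; model_without_regularization is a hypothetical name for it:

import numpy as np

# Mean absolute value of the input-to-hidden weights for each model
print(np.abs(model.get_weights()[0]).mean())                         # with L2 regularization
print(np.abs(model_without_regularization.get_weights()[0]).mean())  # without regularization (hypothetical model)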

Thus, L1 and L2 regularization help us avoid overfitting the training dataset.
