
How to do it...

  1. Import libraries as follows:
import numpy as np 
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD

SEED = 2017
  1. Load the dataset:
data = pd.read_csv('Data/winequality-red.csv', sep=';')
y = data['quality']
X = data.drop(['quality'], axis=1)
  1. Split the dataset into training and testing:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=SEED)
  1. Normalize the input data:
scaler = StandardScaler().fit(X_train)
X_train = pd.DataFrame(scaler.transform(X_train))
X_test = pd.DataFrame(scaler.transform(X_test))
  1. Define the model and optimizer and compile:
model = Sequential()
model.add(Dense(1024, input_dim=X_train.shape[1], activation='relu'))
model.add(Dense(1024, activation='relu'))
model.add(Dense(512, activation='relu'))
model.add(Dense(512, activation='relu'))
# Output layer
model.add(Dense(1, activation='linear'))
# Set optimizer
opt = SGD()
# Compile model
model.compile(loss='mse', optimizer=opt, metrics=['accuracy'])
  1. Set the hyperparameters and train the model:
n_epochs = 500
batch_size = 256

history = model.fit(X_train.values, y_train, batch_size=batch_size, epochs=n_epochs, validation_split=0.2, verbose=0)
  1. Predict on the test set:
predictions = model.predict(X_test.values)
# Count a prediction as correct when it rounds to the true quality score
print('Test accuracy: {:.2f}%'.format(np.mean(y_test.values == predictions.flatten().round()) * 100))
  1. Plot the training and validation accuracy:
# Newer Keras versions store these under the keys 'accuracy' and 'val_accuracy'
plt.plot(np.arange(len(history.history['acc'])), history.history['acc'], label='training')
plt.plot(np.arange(len(history.history['val_acc'])), history.history['val_acc'], label='validation')
plt.title('Accuracy')
plt.xlabel('epochs')
plt.ylabel('accuracy')
plt.legend(loc=0)
plt.show()

The following graph is obtained:

Figure 2.12: Training and validation accuracy
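
Because the network has a single linear output, "accuracy" here only makes sense if a prediction counts as correct when it rounds to the true quality score, as in the test-set step above. The following is a minimal sketch of that calculation; the helper name `rounded_accuracy` and the sample values are made up for illustration and are not part of the recipe.

```python
import numpy as np

def rounded_accuracy(y_true, y_pred):
    """Fraction of regression outputs that round to the true integer label."""
    y_true = np.asarray(y_true)
    # model.predict returns shape (n, 1); flatten it to shape (n,)
    y_pred = np.asarray(y_pred).reshape(-1)
    return float(np.mean(np.round(y_pred) == y_true))

# Hypothetical labels and predictions, for illustration only
y_true = [5, 6, 5, 7]
y_pred = [[5.2], [5.4], [4.9], [6.8]]

print('{:.2f}%'.format(rounded_accuracy(y_true, y_pred) * 100))
```

In this toy example, three of the four predictions round to the correct score.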
We should focus on the validation accuracy and use early stopping to halt training after around 450 epochs, where the validation accuracy is highest. In the sections Improving generalization with regularization and Adding dropout to prevent overfitting, we will introduce techniques to prevent overfitting. By using these techniques, we can create deeper models without overfitting on the training data.
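
The early stopping idea mentioned above can be sketched in plain Python: track the best validation score seen so far and stop once it has not improved for a fixed number of epochs. This is only an illustration of the logic, not the recipe's own code; the function name `best_epoch_with_patience`, the `patience` and `min_delta` parameters, and the sample scores are all assumptions.

```python
def best_epoch_with_patience(val_scores, patience=10, min_delta=0.0):
    """Simulate early stopping on a higher-is-better validation metric.

    Returns (stop_epoch, best_epoch, best_score), with 0-based epochs.
    """
    best_score = float('-inf')
    best_epoch = -1
    wait = 0  # epochs since the last improvement
    for epoch, score in enumerate(val_scores):
        if score > best_score + min_delta:
            best_score, best_epoch, wait = score, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                # No improvement for `patience` epochs: stop here
                return epoch, best_epoch, best_score
    # Ran through every epoch without triggering the stop
    return len(val_scores) - 1, best_epoch, best_score

# Hypothetical validation accuracies, made up for illustration
val_acc = [0.50, 0.55, 0.58, 0.57, 0.59, 0.58, 0.58, 0.57, 0.56]
stop_epoch, best_epoch, best_score = best_epoch_with_patience(val_acc, patience=3)
print(stop_epoch, best_epoch, best_score)
```

In practice, the same list could come from `history.history['val_acc']`. Keras also provides an `EarlyStopping` callback in `keras.callbacks` that can be passed to `model.fit` through its `callbacks` argument, which avoids training for the full number of epochs.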