官术网_书友最值得收藏!

The regression model

The previous section developed a deep learning model for a binary classification task, this section develops a deep learning model to predict a continuous numeric value, regression analysis. We use the same dataset that we used for the binary classification task, but we use a different target column to predict for. In that task, we wanted to predict whether a customer would return to our stores in the next 14 days. In this task, we want to predict how much a customer will spend in our stores in the next 14 days. We follow a similar process; we load and prepare our dataset by applying log transformations to the data. The code is in Chapter4/regression.R:

set.seed(42)
fileName <- "../dunnhumby/predict.csv"
dfData <- read_csv(fileName,
col_types = cols(
.default = col_double(),
CUST_CODE = col_character(),
Y_categ = col_integer())
)
nobs <- nrow(dfData)
train <- sample(nobs, 0.9*nobs)
test <- setdiff(seq_len(nobs), train)
predictorCols <- colnames(dfData)[!(colnames(dfData) %in% c("CUST_CODE","Y_numeric","Y_numeric"))]

dfData[, c("Y_numeric",predictorCols)] <- log(0.01+dfData[, c("Y_numeric",predictorCols)])
trainData <- dfData[train, c(predictorCols,"Y_numeric")]
testData <- dfData[test, c(predictorCols,"Y_numeric")]

xtrain <- model.matrix(Y_numeric~.,trainData)
xtest <- model.matrix(Y_numeric~.,testData)

We then perform regression analysis on the data using lm to create a benchmark before creating a deep learning model:

# lm Regression Model
regModel1=lm(Y_numeric ~ .,data=trainData)
pr1 <- predict(regModel1,testData)
rmse <- sqrt(mean((exp(pr1)-exp(testData[,"Y_numeric"]$Y_numeric))^2))
print(sprintf(" Regression RMSE = %1.2f",rmse))
[1] " Regression RMSE = 29.30"
mae <- mean(abs(exp(pr1)-exp(testData[,"Y_numeric"]$Y_numeric)))
print(sprintf(" Regression MAE = %1.2f",mae))
[1] " Regression MAE = 13.89"

We output two metrics, rmse and mae, for our regression task. We covered these earlier in the chapter. Mean absolute error measures the absolute differences between the predicted value and the actual value. Root mean squared error (rmse) penalizes the square of the differences between the predicted value and the actual value, so one big error costs more than the sum of the small errors. Now let's look at the deep learning regression code. First we load the data and define the model:

require(mxnet)
Loading required package: mxnet

# MXNet expects matrices
train_X <- data.matrix(trainData[, predictorCols])
test_X <- data.matrix(testData[, predictorCols])
train_Y <- trainData$Y_numeric

set.seed(42)
# hyper-parameters
num_hidden <- c(256,128,128,64)
drop_out <- c(0.4,0.4,0.4,0.4)
wd=0.00001
lr <- 0.0002
num_epochs <- 100
activ <- "tanh"

# create our model architecture
# using the hyper-parameters defined above
data <- mx.symbol.Variable("data")
fc1 <- mx.symbol.FullyConnected(data, name="fc1", num_hidden=num_hidden[1])
act1 <- mx.symbol.Activation(fc1, name="activ1", act_type=activ)
drop1 <- mx.symbol.Dropout(data=act1,p=drop_out[1])

fc2 <- mx.symbol.FullyConnected(drop1, name="fc2", num_hidden=num_hidden[2])
act2 <- mx.symbol.Activation(fc2, name="activ2", act_type=activ)
drop2 <- mx.symbol.Dropout(data=act2,p=drop_out[2])

fc3 <- mx.symbol.FullyConnected(drop2, name="fc3", num_hidden=num_hidden[3])
act3 <- mx.symbol.Activation(fc3, name="activ3", act_type=activ)
drop3 <- mx.symbol.Dropout(data=act3,p=drop_out[3])

fc4 <- mx.symbol.FullyConnected(drop3, name="fc4", num_hidden=num_hidden[4])
act4 <- mx.symbol.Activation(fc4, name="activ4", act_type=activ)
drop4 <- mx.symbol.Dropout(data=act4,p=drop_out[4])

fc5 <- mx.symbol.FullyConnected(drop4, name="fc5", num_hidden=1)
lro <- mx.symbol.LinearRegressionOutput(fc5)

Now we train the model; note that the first comment shows how to switch to using a GPU instead of a CPU:

# run on cpu, change to 'devices <- mx.gpu()'
# if you have a suitable GPU card
devices <- mx.cpu()
mx.set.seed(0)
tic <- proc.time()
# This actually trains the model
model <- mx.model.FeedForward.create(lro, X = train_X, y = train_Y,
ctx = devices,num.round = num_epochs,
learning.rate = lr, momentum = 0.9,
eval.metric = mx.metric.rmse,
initializer = mx.init.uniform(0.1),
wd=wd,
epoch.end.callback = mx.callback.log.train.metric(1))
print(proc.time() - tic)
user system elapsed
13.90 1.82 10.50

pr4 <- predict(model, test_X)[1,]
rmse <- sqrt(mean((exp(pr4)-exp(testData[,"Y_numeric"]$Y_numeric))^2))
print(sprintf(" Deep Learning Regression RMSE = %1.2f",rmse))
[1] " Deep Learning Regression RMSE = 28.92"
mae <- mean(abs(exp(pr4)-exp(testData[,"Y_numeric"]$Y_numeric)))
print(sprintf(" Deep Learning Regression MAE = %1.2f",mae))
[1] " Deep Learning Regression MAE = 14.33"
rm(data,fc1,act1,fc2,act2,fc3,act3,fc4,lro,model)

For regression metrics, lower is better, so our rmse metric on the deep learning model (28.92) is an improvement on the original regression model (29.30). Interestingly, the mae on the the deep learning model (14.33) is actually worse than the original regression model (13.89). Since rsme penalizes big differences between actual and predicted values more, this indicates that the errors in the deep learning model are less extreme than the regression model.

主站蜘蛛池模板: 石景山区| 教育| 隆林| 浙江省| 府谷县| 灌阳县| 长海县| 沂源县| 泗洪县| 阿鲁科尔沁旗| 东兴市| 襄垣县| 白山市| 新巴尔虎右旗| 金秀| 富源县| 张家口市| 高唐县| 嘉禾县| 南漳县| 南投县| 固镇县| 隆德县| 海盐县| 辽阳县| 建昌县| 上饶县| 项城市| 翼城县| 东阿县| 奉贤区| 曲靖市| 辽宁省| 沂水县| 甘谷县| 祥云县| 普格县| 安西县| 婺源县| 化德县| 霍城县|