官术网_书友最值得收藏!

ROC test

The ROC curve is an important improvement on the false positive and true negative measures of model performance. For a detailed explanation, refer to Chapter 9 of Tattar et al. (2017). The ROC curve basically plots the true positive rate against the false positive rate, and we measure the AUC for the fitted model.

The main goal that the ROC test attempts to achieve is the following. Suppose that Model 1 gives an AUC of 0.89 and Model 2 gives 0.91. Using the simple AUC criteria, we outright conclude that Model 2 is better than Model 1. However, an important question that arises is whether 0.91 is significantly higher than 0.89. The roc.test, from the pROC R package, provides the answer here. For the neural network and classification tree, the following R segment gives the required answer:

> library(pROC)
> HT_NN_Prob <- predict(NN_fit,newdata=HT2_TestX,type="raw")
> HT_NN_roc <- roc(HT2_TestY,c(HT_NN_Prob))
> HT_NN_roc$auc
Area under the curve: 0.9723826
> HT_CT_Prob <- predict(CT_fit,newdata=HT2_TestX,type="prob")[,2]
> HT_CT_roc <- roc(HT2_TestY,HT_CT_Prob)
> HT_CT_roc$auc
Area under the curve: 0.9598765
> roc.test(HT_NN_roc,HT_CT_roc)
        DeLong's test for two correlated ROC curves
data:  HT_NN_roc and HT_CT_roc
Z = 0.72452214, p-value = 0.4687452
alternative hypothesis: true difference in AUC is not equal to 0
sample estimates:
 AUC of roc1  AUC of roc2 
0.9723825557 0.9598765432 

Since the p-value is very large, we conclude that the AUC for the two models is not significantly different.

Statistical tests are vital and we recommend that they be used whenever suitable. The concepts highlighted in this chapter will be drawn on in more detail in the rest of the book.

主站蜘蛛池模板: 友谊县| 共和县| 余庆县| 吴江市| 威宁| 乌鲁木齐市| 徐州市| 张家港市| 新绛县| 阿勒泰市| 原阳县| 朝阳县| 布尔津县| 永靖县| 台南县| 金昌市| 武隆县| 元阳县| 隆昌县| 新源县| 高青县| 惠水县| 宜兰县| 当阳市| 高雄市| 集安市| 南靖县| 钟山县| 格尔木市| 华坪县| 刚察县| 社会| 通州市| 平泉县| 辛集市| 滨海县| 林甸县| 宿迁市| 拜泉县| 凌源市| 临夏市|