Multivariate adaptive regression splines
In the prior chapter, we went through a discussion of MARS, how it works, why use it, and so on, so I won't duplicate that here, other than to note that it can be applied to a classification problem as a generalized linear model. One of its key benefits is its power to conduct feature selection, so there's no need to run stepwise selection or IV, or even regularization, for that matter.
We'll train it with 5-fold cross-validation and set nprune = 15 to cap the pruned model at a maximum of 15 terms. Recall from the previous chapter that, because MARS fits piecewise splines, it can generate more terms than there are features, so a cap is useful.
This code will give us our model object. Be advised that this may take some time to complete:
> set.seed(1972)
> earth_fit <-
    earth::earth(
      x = train[, -142],              # predictors: all columns except the response
      y = train[, 142],               # the response column
      pmethod = 'cv',                 # prune terms using cross-validation
      nfold = 5,                      # 5-fold CV
      degree = 1,                     # additive model; no interaction terms
      minspan = -1,                   # allow at most one knot per predictor
      nprune = 15,                    # retain at most 15 terms after pruning
      glm = list(family = binomial)   # fit the final model as a logistic GLM
    )
Here's the model summary:
> summary(earth_fit)
Call: earth(x=train[,-142], y=train[,142], pmethod="cv",
glm=list(family=binomial), degree=1, nprune=15, nfold=5,
minspan=-1)
GLM coefficients
y
(Intercept) -3.4407
V23 -6.6750
V24 -1.3539
V105 -0.8200
h(28-V2) -0.4769
h(V2-28) 0.0071
h(1-V31) 1.4876
h(V31-1) 0.0947
h(106449-V141) 0.0000
h(V141-106449) 0.0000
Earth selected 10 of 10 terms, and 6 of 141 predictors using pmethod="cv"
As you can see in the summary, the model ended up with six predictive features and a total of ten terms, including paired hinge functions on V2, V31, and V141. By convention, a hinge pair is read as follows: the first term applies when the predictor is below the knot value, and the second when it is at or above the knot. For instance, for V31, a value less than 1 enters the model with a coefficient of 1.4876; otherwise, the coefficient is 0.0947.
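To make the hinge notation concrete, here's a minimal sketch of the hinge function itself and the contribution of the paired V31 terms, using the fitted coefficients from the summary above (the V31 values are illustrative):
> h <- function(x) pmax(0, x)    # the hinge function at the heart of MARS
> v31 <- c(0, 0.5, 1, 2)         # illustrative values of V31
> 1.4876 * h(1 - v31) + 0.0947 * h(v31 - 1)
[1] 1.4876 0.7438 0.0000 0.0947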
We can plot the model's response to each selected feature with respect to predicted probability. Setting ylim = NA gives each plot its own vertical scale, which makes changes in y (the predicted probability) easier to see against changes in the feature values:
> plotmo::plotmo(earth_fit, ylim = NA)
The output of the preceding code is as follows:
[Figure: plotmo plots of predicted probability versus each selected feature]
Notice how, for V31, values of zero follow one slope and larger values another, as described previously. Feature importance is trivial to produce:
> earth::evimp(earth_fit)
nsubsets gcv rss
V31 9 100.0 100.0
V2 8 74.8 75.0
V141 7 40.2 41.1
V105 6 31.5 32.5
V23 5 26.8 27.8
V24 4 20.7 21.8
The nsubsets criterion counts the number of model subsets that include the feature; features that are included in more subsets are considered more important. These subsets are the term subsets generated by earth's pruning pass.
We specified cross-validation for the final pruning, but earth still performs its forward selection and backward elimination using Generalized Cross-Validation (GCV), as discussed in the previous chapter. The gcv and rss results for each feature are normalized to a 0 to 100 scale for comparison purposes, with the most important feature scoring 100.
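If you prefer a picture, the object returned by evimp has an associated plot method, which draws the nsubsets, gcv, and rss criteria together:
> ev <- earth::evimp(earth_fit)
> plot(ev)    # plots nsubsets, gcv, and rss importance for each feature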
As we did previously, a plot of the probability densities is useful, and earth comes with its own plotd() function:
> earth::plotd(earth_fit)
[Figure: predicted probability densities by class]
I like how the predicted values are reversed compared to the prior plots. Other than that, it's hard to discern anything meaningful, except that the densities are quite similar. Let's get the cutoff value:
> pred <- predict(earth_fit, train, type = 'response')   # in-sample predicted probabilities
> mars_cutoff <-
    InformationValue::optimalCutoff(
      train$y,
      pred,
      optimiseFor = 'Both',      # balance true positive and false positive rates
      returnDiagnostics = TRUE
    )
Examination of the object provides the following:
- Optimal cutoff = 0.04976
- TPR = 0.6449
- FPR = 0.208
In comparison with logistic regression, we have a higher rate of finding true positives at the expense of a slightly higher rate of false positives.
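To turn the probabilities into class labels, we can apply the cutoff directly. A minimal sketch follows, assuming the diagnostics list stores the cutoff in an element named optimalCutoff (check names(mars_cutoff) on your object):
> class_pred <- ifelse(pred >= mars_cutoff$optimalCutoff, 1, 0)   # assumed element name
> table(actual = train$y, predicted = class_pred)                 # confusion matrix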
Let's move on to evaluating performance on the test set:
> test_pred <- predict(earth_fit, test, type = 'response')
> Metrics::auc(test$y, test_pred)
[1] 0.8079
> Metrics::logLoss(test$y, test_pred)
[1] 0.1406
What do we see here? A slight improvement in AUC, and a lower (better) log-loss. While not dramatic, it may be of value. We can now turn to visually comparing the two models to confirm that MARS is indeed the preferred algorithm.
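As a preview of that visual comparison, here's a minimal sketch using the pROC package; it assumes the logistic regression's test-set probabilities from earlier are available in a vector, with logit_pred as a hypothetical placeholder name:
> library(pROC)
> roc_mars <- roc(test$y, as.vector(test_pred))   # MARS test-set ROC curve
> roc_logit <- roc(test$y, logit_pred)            # logit_pred is a hypothetical name
> plot(roc_mars, col = 'blue')
> plot(roc_logit, col = 'red', add = TRUE)
> legend('bottomright', c('MARS', 'Logistic'), col = c('blue', 'red'), lwd = 2)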