官术网_书友最值得收藏!

Waveform

This dataset is an example of a simulation study. Here, we have twenty-one variables as input or independent variables, and a class variable referred to as classes. The data is generated using the mlbench.waveform function from the mlbench R package. For more details, refer to the following link: ftp://ftp.ics.uci.edu/pub/machine-learning-databases. We will simulate 5,000 observations for this dataset. As mentioned earlier, the set.seed function guarantees reproducibility. Since we are solving binary classification problems, we will reduce the three classes generated by the waveform function to two, and then partition the data into training and testing parts for model building and testing purposes:

> library(mlbench)
> set.seed(123)
> Waveform <- mlbench.waveform(5000)
> table(Waveform$classes)
   1    2    3 
1687 1718 1595 
> Waveform$classes <- ifelse(Waveform$classes!=3,1,2)
> Waveform_DF <- data.frame(cbind(Waveform$x,Waveform$classes)) # Data Frame
> names(Waveform_DF) <- c(paste0("X",".",1:21),"Classes")
> Waveform_DF$Classes <- as.factor(Waveform_DF$Classes)
> table(Waveform_DF$Classes)
   1    2 
3405 1595 

The R function mlbench.waveform creates a new object of the mlbench class. Since it consists of two sub-parts in x and classes, we will convert it into data.frame following some further manipulations. The cbind function binds the two objects x (a matrix) and classes (a numeric vector) into a single matrix. The data.frame function converts the matrix object into a data frame, which is the class desired for the rest of the program.

After partitioning the data, we will create the required formula for the waveform dataset:

> set.seed(12345)
> Train_Test <- sample(c("Train","Test"),nrow(Waveform_DF),replace = TRUE,
+ prob = c(0.7,0.3))
> head(Train_Test)
[1] "Test"  "Test"  "Test"  "Test"  "Train" "Train"
> Waveform_DF_Train <- Waveform_DF[Train_Test=="Train",]
> Waveform_DF_TestX <- within(Waveform_DF[Train_Test=="Test",],rm(Classes))
> Waveform_DF_TestY <- Waveform_DF[Train_Test=="Test","Classes"]
> Waveform_DF_Formula <- as.formula("Classes~.")
主站蜘蛛池模板: 汾阳市| 长兴县| 吉木乃县| 宜兰县| 家居| 广东省| 灵山县| 肥东县| 诸城市| 萝北县| 屏边| 怀化市| 乾安县| 清水县| 阿拉善左旗| 大田县| 东乌| 桐柏县| 萍乡市| 道孚县| 肇庆市| 花垣县| 莱阳市| 上饶县| 巨野县| 手游| 西华县| 资阳市| 石棉县| 上蔡县| 北安市| 和田市| 攀枝花市| 湛江市| 河东区| 三原县| 香河县| 阳西县| 沙河市| 万安县| 利川市|