
K-Nearest Neighbors and Support Vector Machines

"Statistical thinking will one day be as necessary for efficient citizenship as the ability to read and write."
–H.G. Wells

In Chapter 3, Logistic Regression, we discussed using generalized linear models to determine the probability that a predicted observation belongs to a category of the response, which we refer to as a classification problem. That was just the beginning of classification methods; there are many techniques we can use to try to improve our predictions.

In this chapter, we'll delve into two nonlinear techniques: K-Nearest Neighbors (KNN) and Support Vector Machines (SVMs). These techniques are more sophisticated than those we discussed earlier because the assumption of linearity can be relaxed: a linear combination of the features is no longer needed to define the decision boundary. Be forewarned, though, that this doesn't guarantee superior predictive ability. Additionally, these models can be difficult to interpret for business partners, and they can be computationally inefficient. When used wisely, they provide a powerful complement to the other tools and techniques discussed in this book. They can be used for continuous outcomes in addition to classification problems; however, for this chapter, we'll focus only on the latter.

After a high-level background on the techniques, we'll put both of them to the test, starting with KNN.

Following are the topics that we'll be covering in this chapter:

  • K-nearest neighbors
  • Support vector machines
  • Manipulating data
  • Modeling and evaluation