Summary

In this chapter, we used probabilistic linear models to predict a qualitative response with two generalized linear model methods: logistic regression and multivariate adaptive regression splines. We explored weight of evidence and information value as a technique for univariate feature selection. We covered how to find the probability threshold that minimizes classification error. Additionally, we began using performance metrics such as AUC and log-loss, along with ROC charts, to compare models both visually and statistically. These metrics proved more informative than accuracy alone, especially when class labels are highly imbalanced. In the next chapter, we'll cover regularization methods for feature selection and how they can be used in training your algorithms. We'll see how to create a dataset, learn about ridge regression, and dive deeper into feature selection.
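
As a compact reminder of two ideas recapped above, the following is a minimal sketch of picking the probability threshold that minimizes classification error and of computing log-loss. Python is used here purely for illustration; the function names, the candidate threshold grid, and the toy labels and probabilities are all assumptions made for this example and are not taken from the chapter.

```python
import math

def log_loss(y_true, p_pred, eps=1e-15):
    """Mean negative log-likelihood of binary labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip so log(0) never occurs
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def classification_error(y_true, p_pred, threshold):
    """Share of observations misclassified when predicting 1 for p >= threshold."""
    wrong = sum(1 for y, p in zip(y_true, p_pred) if (p >= threshold) != (y == 1))
    return wrong / len(y_true)

def best_threshold(y_true, p_pred, candidates):
    """Return the candidate threshold with the lowest classification error."""
    return min(candidates, key=lambda t: classification_error(y_true, p_pred, t))

# Hypothetical, imbalanced toy data: 6 negatives and 2 positives.
y = [0, 0, 0, 0, 0, 0, 1, 1]
p = [0.05, 0.10, 0.12, 0.20, 0.22, 0.28, 0.36, 0.45]

candidates = [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95
t = best_threshold(y, p, candidates)

print("error at default 0.5:", classification_error(y, p, 0.5))
print("best threshold:", t, "error:", classification_error(y, p, t))
print("log-loss:", log_loss(y, p))
```

With these toy values, no predicted probability reaches 0.5, so the default cutoff labels every observation as negative and misses both positives, while a lower cutoff separates the classes. This is the same reason accuracy alone can mislead when classes are imbalanced.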
