What this book covers

Here is a list of changes compared with the second edition by chapter.

Chapter 1, Preparing and Understanding Data, covers the loading of data and demonstrates how to obtain an understanding of its structure and dimensions, as well as how to install the necessary packages.

Chapter 2, Linear Regression, contains improved code and superior charts; other than that, it remains relatively close to the original.

Chapter 3, Logistic Regression, contains improved and streamlined code. One of my favorite techniques, multivariate adaptive regression splines, has been added. This technique performs well, handles non-linearity, and is easy to explain. It is my base model.

Chapter 4, Advanced Feature Selection in Linear Models, includes techniques not only for regression, but also for a classification problem.

Chapter 5, K-Nearest Neighbors and Support Vector Machines, includes streamlined and simplified code.

Chapter 6, Tree-Based Classification, is augmented by the addition of the very popular techniques provided by the XGBOOST package. Additionally, the technique of using a random forest as a feature selection tool is incorporated.

Chapter 7, Neural Networks and Deep Learning, has been updated with additional information on deep learning methods and includes improved code for the H2O package, including hyperparameter search.

Chapter 8, Creating Ensembles and Multiclass Methods, has completely new content, involving the utilization of several great packages. 

Chapter 9, Cluster Analysis, adds a methodology for executing unsupervised learning with random forests.

Chapter 10, Principal Component Analysis, uses a different dataset, and an out-of-sample prediction has been added.

Chapter 11, Association Analysis, explains association analysis, which applies not only to making recommendations, product placement, and promotional pricing, but also to manufacturing, web usage, and healthcare.

Chapter 12, Time Series and Causality, includes a couple of additional years of climate data, along with a demonstration of different causality test methods.

Chapter 13, Text Mining, includes additional data and improved code.

Appendix, Creating a Package, includes additional data packages.
