
How it works...

In this recipe, we have learned about the following concepts:

  • Imputing missing values: We learned that one way to impute the missing values of a variable is to replace them with the median of that variable. Other options are to replace them with the variable's mean, or with the mean of the variable's values in the rows that are most similar to the row containing the missing value (this technique relies on identifying the K-Nearest Neighbours).
  • Capping the outlier values: We also learned that one way to cap outliers is to replace every value above the 95th percentile with the 95th percentile value. We did this so that, once the variable is scaled by its maximum value, the values are not all clustered around a small number, which is what happens when the maximum is an outlier.
  • Scaling the dataset: Finally, we scaled the dataset so that it can be passed to a neural network.
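
The three steps above can be sketched in a few lines of pandas. This is a minimal illustration under stated assumptions, not the recipe's actual code: the helper name preprocess, the column name income, and the toy data are all hypothetical, and scaling is done by dividing each column by its maximum after capping.

```python
import numpy as np
import pandas as pd


def preprocess(df: pd.DataFrame) -> pd.DataFrame:
    """Median-impute, cap at the 95th percentile, then scale by the maximum.

    Hypothetical helper for illustration; assumes every column is numeric.
    """
    out = df.copy()
    for col in out.columns:
        # 1. Impute: replace missing values with the column median.
        out[col] = out[col].fillna(out[col].median())
        # 2. Cap: replace values above the 95th percentile with that percentile.
        cap = out[col].quantile(0.95)
        out[col] = out[col].clip(upper=cap)
        # 3. Scale: divide by the (now capped) maximum.
        max_val = out[col].max()
        if max_val != 0:  # avoid dividing an all-zero column by zero
            out[col] = out[col] / max_val
    return out


# Toy data (made up): the values 1..99, one extreme outlier, one missing value.
toy = pd.DataFrame({"income": list(range(1, 100)) + [10000.0, np.nan]})
scaled = preprocess(toy)
```

Without the capping step, dividing by the outlier (10000) would squeeze every other value in this toy column below 0.01. With the cap in place, the typical values are spread across a much wider part of the scaled range.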