How it works...

In this recipe, we have learned about the following concepts:

  • Imputing missing values: One way to impute the missing values of a variable is to replace them with the median of that variable. Other options are to replace them with the variable's mean, or with the mean of the variable's values in the rows that are most similar to the row containing the missing value (a technique based on identifying the K-Nearest Neighbours).
  • Capping the outlier values: One way to cap outliers is to replace every value above the 95th percentile with the 95th percentile value. We did this because scaling a variable by its maximum value, when that maximum is an outlier, would leave almost all of the remaining values clustered around a small value.
  • Scaling the dataset: Finally, we scaled the dataset so that it can be passed to a neural network.
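
The three steps above can be sketched on a single numeric column as follows. This is a minimal illustration using NumPy, not the recipe's own code: the helper names (impute_median, cap_at_percentile, scale_by_max) and the sample values are hypothetical, and scaling by the maximum assumes the column's maximum is positive.

```python
import numpy as np

def impute_median(col):
    """Replace NaN entries with the median of the observed (non-NaN) values."""
    col = np.asarray(col, dtype=float)
    median = np.nanmedian(col)  # median computed while ignoring NaN
    return np.where(np.isnan(col), median, col)

def cap_at_percentile(col, pct=95):
    """Replace values above the pct-th percentile with that percentile value."""
    col = np.asarray(col, dtype=float)
    cap = np.percentile(col, pct)
    return np.minimum(col, cap)

def scale_by_max(col):
    """Divide by the column maximum (assumed positive) so the largest value is 1."""
    col = np.asarray(col, dtype=float)
    return col / col.max()

# Hypothetical column with one missing value and one extreme outlier.
raw = np.array([1.0, 2.0, np.nan, 3.0, 4.0, 1000.0])

imputed = impute_median(raw)         # NaN is replaced by the median of the rest
capped = cap_at_percentile(imputed)  # the outlier is pulled down to the 95th percentile
scaled = scale_by_max(capped)        # values now lie in (0, 1]

print(imputed)
print(capped)
print(scaled)
```

The order matters in this sketch: imputing first means the percentile is computed on a column without gaps, and capping before scaling keeps the outlier from determining the divisor. On a column this short the 95th percentile is still pulled up by the outlier, so the effect is clearer on a realistic number of rows.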