
How it works...

In this recipe, we have learned about the following concepts:

  • Imputing missing values: We learned that one way to impute the missing values of a variable is to replace them with the median of that variable. Other options are to replace them with the variable's mean, or with the mean of the variable's values in the rows that are most similar to the row containing the missing value (this technique relies on identifying the K-Nearest Neighbours).
  • Capping the outlier values: We also learned that one way to cap outliers is to replace values above the 95th percentile with the 95th percentile value. We did this because, when a variable is scaled by its maximum value and that maximum is an outlier, almost all of the scaled values end up clustered around a small value.
  • Scaling the dataset: Finally, we scaled the dataset so that it can be passed to a neural network.
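
The three steps above can be sketched in a few lines of NumPy. This is a minimal illustration and not the recipe's own code: the function names and the `income` column are hypothetical, the data is made up, and scaling is assumed to mean dividing by the column maximum, as the capping explanation suggests.

```python
import numpy as np


def impute_median(col):
    """Replace NaN entries with the median of the observed values."""
    col = np.asarray(col, dtype=float)
    median = np.nanmedian(col)  # median that ignores NaN entries
    return np.where(np.isnan(col), median, col)


def cap_at_percentile(col, pct=95):
    """Replace values above the pct-th percentile with that percentile."""
    col = np.asarray(col, dtype=float)
    cap = np.percentile(col, pct)
    return np.minimum(col, cap)


def scale_by_max(col):
    """Divide by the column maximum (assumes non-negative values)."""
    col = np.asarray(col, dtype=float)
    return col / col.max()


# Hypothetical column: 40 ordinary values, one extreme outlier,
# and one missing value.
income = np.concatenate([np.arange(20.0, 60.0), [1000.0, np.nan]])

imputed = impute_median(income)
capped = cap_at_percentile(imputed, pct=95)
scaled = scale_by_max(capped)

# For comparison: scaling without capping lets the outlier set the maximum.
scaled_without_cap = scale_by_max(imputed)

print("median after capping and scaling:", np.median(scaled))
print("median after scaling only:", np.median(scaled_without_cap))
```

Comparing the two printed medians shows why the capping step comes before scaling: without it, the single outlier becomes the maximum and pushes the typical values towards zero.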