How it works...

In this recipe, we have learned about the following concepts:

  • Imputing missing values: One way to impute the missing values of a variable is to replace them with the median of that variable. Other options are to replace them with the mean of the variable, or with the mean of the variable's value in the rows that are most similar to the row containing the missing value (this technique relies on identifying the K-Nearest Neighbours).
  • Capping the outlier values: One way to cap outliers is to replace every value above the 95th percentile with the 95th percentile value. We do this because, when a variable is scaled by its maximum value and that maximum is an outlier, almost all of the scaled values end up clustered around a small value.
  • Scaling the dataset: Finally, we scaled the dataset so that it can be passed to a neural network.
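
The three steps above can be sketched on a single numeric column as follows. This is a minimal illustration, not the recipe's own code: it assumes NumPy is available, and the function name `preprocess_column` and the toy data are hypothetical. It also assumes the column is non-negative with a positive maximum, so that dividing by the maximum maps the values into the range 0 to 1.

```python
import numpy as np


def preprocess_column(values):
    """Median-impute, cap at the 95th percentile, then scale by the maximum.

    Hypothetical helper for illustration only.
    """
    x = np.asarray(values, dtype=float).copy()

    # 1. Impute: replace missing values (NaN) with the median of the observed values.
    median = np.nanmedian(x)
    x[np.isnan(x)] = median

    # 2. Cap: replace anything above the 95th percentile with the 95th percentile.
    cap = np.percentile(x, 95)
    x = np.minimum(x, cap)

    # 3. Scale: divide by the (now capped) maximum.
    #    Assumes the maximum is positive.
    scaled = x / x.max()
    return scaled, median, cap


# Toy data (an assumption): the values 1..100 with one missing entry
# and one extreme outlier.
raw = np.arange(1, 101, dtype=float)
raw[10] = np.nan       # a missing value
raw[99] = 10_000.0     # an outlier

scaled, median, cap = preprocess_column(raw)
print(median, cap, scaled.min(), scaled.max())
```

Without the capping step, dividing this toy column by its maximum of 10,000 would squeeze every other value to 0.0099 or below, which is the clustering problem described above. After capping, the scaled values are spread across the 0 to 1 range.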