官术网_书友最值得收藏!

Scaling

Values of different features can differ by orders of magnitude. Sometimes this may mean that the larger values dominate the smaller values. This depends on the algorithm we are using. For certain algorithms to work properly we are required to scale the data. There are several common strategies that we can apply:

  • Standardization removes the mean of a feature and divides by the standard deviation. If the feature values are normally distributed, we will get a Gaussian, which is centered around zero with a variance of one.
  • If the feature values are not normally distributed, we can remove the median and divide by the interquartile range. The interquartile range is a range between the first and third quartile (or 25th and 75th percentile).
  • Scaling features to a range is a common choice of range which is a range between zero and one.
主站蜘蛛池模板: 芦山县| 宜兰市| 建平县| 怀安县| 达州市| 新乡县| 清苑县| 林州市| 石泉县| 浮山县| 揭西县| 科技| 民县| 团风县| 普安县| 聂荣县| 嘉义县| 扎囊县| 襄汾县| 新宾| 万宁市| 汾西县| 大城县| 芒康县| 瑞安市| 冀州市| 蓬安县| 龙川县| 遂平县| 和林格尔县| 陕西省| 彩票| 乌什县| 沾化县| 舒兰市| 榕江县| 澄迈县| 财经| 隆尧县| 邹平县| 随州市|