- Python Machine Learning By Example
- Yuxi (Hayden) Liu
- 143字
- 2021-07-02 22:57:19
Scaling
Values of different features can differ by orders of magnitude. Sometimes this may mean that the larger values dominate the smaller values. This depends on the algorithm we are using. For certain algorithms to work properly we are required to scale the data. There are several common strategies that we can apply:
- Standardization removes the mean of a feature and divides by the standard deviation. If the feature values are normally distributed, we will get a Gaussian, which is centered around zero with a variance of one.
- If the feature values are not normally distributed, we can remove the median and divide by the interquartile range. The interquartile range is a range between the first and third quartile (or 25th and 75th percentile).
- Scaling features to a range is a common choice of range which is a range between zero and one.
推薦閱讀
- JavaScript百煉成仙
- iOS面試一戰(zhàn)到底
- Mastering Kotlin
- Java游戲服務(wù)器架構(gòu)實戰(zhàn)
- 程序員考試案例梳理、真題透解與強化訓(xùn)練
- Animate CC二維動畫設(shè)計與制作(微課版)
- Visual C++應(yīng)用開發(fā)
- Node.js全程實例
- Bootstrap 4 Cookbook
- 從Excel到Python數(shù)據(jù)分析:Pandas、xlwings、openpyxl、Matplotlib的交互與應(yīng)用
- 奔跑吧 Linux內(nèi)核
- HTML5 Canvas核心技術(shù):圖形、動畫與游戲開發(fā)
- JBoss AS 7 Development
- Serverless工程實踐:從入門到進階
- Java Script從入門到精通(第5版)