- Python Machine Learning By Example
- Yuxi (Hayden) Liu
- 136字
- 2021-07-02 12:41:35
Scaling
Values of different features can differ by orders of magnitude. Sometimes, this may mean that the larger values dominate the smaller values. This depends on the algorithm we're using. For certain algorithms to work properly, we're required to scale the data.
There are following several common strategies that we can apply:
- Standardization removes the mean of a feature and divides by the standard deviation. If the feature values are normally distributed, we'll get a Gaussian, which is centered around zero with a variance of one.
- If the feature values aren't normally distributed, we can remove the median and divide by the interquartile range. The interquartile range is a range between the first and third quartile (or 25th and 75th percentile).
- Scaling features to a range is a common choice of range between zero and one.
推薦閱讀
- Hands-On Deep Learning with Apache Spark
- Google Cloud Platform Cookbook
- 工業機器人現場編程(FANUC)
- 現代傳感技術
- 從零開始學SQL Server
- 計算機組成與操作系統
- Unreal Development Kit Game Design Cookbook
- Creating ELearning Games with Unity
- 貫通Java Web輕量級應用開發
- PowerPoint 2010幻燈片制作高手速成
- Eclipse全程指南
- PowerPoint 2003中文演示文稿5日通
- ARM? Cortex? M4 Cookbook
- Building Impressive Presentations with Impress.js
- 自動化生產線組建與調試(第2版):以亞龍YL-335B為例(三菱PLC版本)