- Machine Learning for Developers
- Rodolfo Bonnin
- 2021-07-02 15:46:52
Dataset preprocessing
When we first dive into data science, a common mistake is to expect all the data to be polished and well behaved from the very beginning. Alas, that is not the case in a considerable percentage of situations, for many reasons: null data, sensor errors that cause outliers and NaN values, faulty registers, instrument-induced bias, and all kinds of defects that lead to poor model fitting and that must be eradicated.
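As a rough illustration of removing such defects, the sketch below drops NaN entries and then filters outliers with a median-based (modified) z-score. The function name `clean_readings`, the sample readings, and the 3.5 cutoff are assumptions chosen for this example and are not taken from the book. Dropping values is only one option; imputing them is another.

```python
import numpy as np

def clean_readings(readings, z_max=3.5):
    """Drop NaN entries, then drop outliers using a modified z-score.

    Hypothetical helper for illustration; z_max=3.5 is a conventional
    cutoff for the modified z-score, not a value from the source text.
    """
    readings = np.asarray(readings, dtype=float)
    valid = readings[~np.isnan(readings)]      # remove null / NaN values
    median = np.median(valid)
    mad = np.median(np.abs(valid - median))    # median absolute deviation
    if mad == 0:
        return valid                           # no spread: nothing to flag
    robust_z = 0.6745 * (valid - median) / mad
    return valid[np.abs(robust_z) <= z_max]

# Made-up sensor readings with one missing value and one spike
raw = [10.1, 9.8, float("nan"), 10.3, 250.0, 9.9]
cleaned = clean_readings(raw)
```

The median and the median absolute deviation are used here instead of the mean and standard deviation because a single large spike would otherwise distort the very statistics used to detect it.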
The two key processes in this stage are data normalization and feature scaling. Both consist of applying simple transformations, called affine transformations, that map the current unbalanced data into a more manageable shape, maintaining its integrity while providing better statistical properties and improving the models fitted on it later. The common goal of standardization techniques is to bring the location and scale of the data in line with those of a standard normal distribution (zero mean and unit variance).
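Two common affine transformations of this kind can be sketched with NumPy: standardization, which shifts each feature to zero mean and unit variance, and min-max scaling, which maps each feature onto the interval [0, 1]. The function names and the sample matrix are assumptions made for this example, and both functions assume that no feature column is constant, since a constant column would cause a division by zero.

```python
import numpy as np

def standardize(x):
    """Per-feature affine map to zero mean and unit variance."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean(axis=0)) / x.std(axis=0)

def min_max_scale(x):
    """Per-feature affine map onto the interval [0, 1]."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(axis=0), x.max(axis=0)
    return (x - lo) / (hi - lo)

# Made-up data: two features on very different scales (rows are samples)
data = np.array([[1.0, 200.0],
                 [2.0, 400.0],
                 [3.0, 600.0]])

z = standardize(data)      # each column now has mean 0 and std 1
s = min_max_scale(data)    # each column now spans exactly [0, 1]
```

Because both maps are affine, they change only the location and scale of each feature; the relative ordering and spacing of the samples within a feature are preserved.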