- Machine Learning for Developers
- Rodolfo Bonnin
- 2021-07-02 15:46:52
Imputation of missing data
When dealing with imperfect or incomplete datasets, a missing entry adds no value to the model by itself, but the other elements of its row may still be useful. This is especially true when the dataset has a high percentage of incomplete values, so that discarding every affected row is not an option.
The main question in this process is "how do you interpret a missing value?" There are many ways, and they usually depend on the problem itself.
A very naive approach is to set the value to zero, which assumes that the mean of the data distribution is 0. An improvement is to relate the missing data to the surrounding content by assigning the average of the whole column, or of an interval of n elements of the same column. Another option is to use the column's median or most frequent value.
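The strategies above can be sketched in a few lines of plain Python. This is a minimal illustration, not the book's own code: the function names `impute` and `impute_local_mean`, the strategy labels, and the sample `ages` column are all assumptions made for the example, and missing values are assumed to be encoded as NaN.

```python
import math
from statistics import mean, median, mode


def impute(column, strategy="mean"):
    """Return a copy of `column` with NaN entries replaced by one fill value.

    Hypothetical helper for illustration; `strategy` selects how the
    fill value is derived from the observed (non-NaN) entries.
    """
    observed = [x for x in column if not math.isnan(x)]
    if strategy == "zero":
        fill = 0.0                 # naive: assumes the data is centered on 0
    elif strategy == "mean":
        fill = mean(observed)      # average of the whole column
    elif strategy == "median":
        fill = median(observed)    # less sensitive to outliers than the mean
    elif strategy == "most_frequent":
        fill = mode(observed)      # most common observed value
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if math.isnan(x) else x for x in column]


def impute_local_mean(column, n=2):
    """Replace each NaN with the mean of the observed values found
    within n positions on either side of it in the same column."""
    result = list(column)
    for i, x in enumerate(column):
        if math.isnan(x):
            window = column[max(0, i - n): i + n + 1]
            neighbors = [v for v in window if not math.isnan(v)]
            if neighbors:          # leave the gap if the window is empty
                result[i] = mean(neighbors)
    return result


# Hypothetical sample column with two missing entries.
ages = [22.0, 25.0, float("nan"), 25.0, 40.0, float("nan"), 38.0]

print(impute(ages, "zero"))
print(impute(ages, "mean"))
print(impute(ages, "median"))
print(impute(ages, "most_frequent"))
print(impute_local_mean(ages, n=1))
```

Which strategy fits best depends on the problem: the column mean is pulled toward outliers, the median is more robust, and the local average only makes sense when neighboring rows are related, as in ordered or time-series data.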
Additionally, there are more advanced techniques, such as robust methods and even k-nearest neighbors, that we won't cover in this book.