- Python Data Analysis Cookbook
- Ivan Idris
- 202 words
- 2021-07-14 11:05:47
Introduction
In the real world, data rarely matches textbook definitions and examples. We have to deal with issues such as faulty hardware, uncooperative customers, and disgruntled colleagues. It is difficult to predict what kind of issues you will run into, but it is safe to assume that they will be plentiful and challenging. In this chapter, I will sketch some common approaches to dealing with noisy data, which are based more on rules of thumb than on strict science. Luckily, the trial-and-error part of data analysis is limited.
Most of this chapter is about outlier management. Outliers are values that we consider abnormal. They are not the only issue you will encounter, but they are a sneaky one. Missing or invalid values are another common issue, so I will briefly mention masked arrays and pandas features such as the dropna() function, which I have used throughout this book.
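As a rough illustration of the two tools mentioned above, the following sketch handles missing and invalid values first with a NumPy masked array and then with the pandas dropna() method. The sample readings and the -999.0 sentinel for an invalid measurement are assumptions made up for this example, not data from the book.

```python
import numpy as np
import pandas as pd

# Hypothetical sensor readings: NaN marks a missing value and
# -999.0 is an assumed sentinel for an invalid measurement.
readings = np.array([20.1, 19.8, -999.0, 20.4, np.nan, 20.0])

# Masked array: hide the bad entries without deleting them.
bad = np.isnan(readings) | (readings == -999.0)
masked = np.ma.masked_array(readings, mask=bad)
print("valid count (masked array):", masked.count())
print("mean (masked array):", masked.mean())

# pandas: turn the sentinel into NaN, then drop all NaN entries.
clean = pd.Series(readings).replace(-999.0, np.nan).dropna()
print("valid count (dropna):", len(clean))
print("mean (dropna):", clean.mean())
```

The masked array keeps the original shape and simply ignores the masked entries in computations, whereas dropna() returns a shorter series. Which one is more convenient depends on whether the positions of the bad values still matter later in the analysis.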
I have also written two recipes about using mpmath for arbitrary precision calculations. I don't recommend using mpmath unless you really have to, because of the performance penalty it incurs. Usually we can work around numerical issues, so arbitrary precision libraries are rarely needed.
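The point about working around numerical issues can be sketched with a small example. The choice of function, log(1 + x) for a tiny x, is an assumption picked for illustration and is not taken from the recipes. The naive floating-point evaluation loses digits, the standard library function math.log1p avoids the problem, and mpmath is used only to produce a high-precision reference value.

```python
import math
from mpmath import mp, mpf

x = 1e-10

# Naive evaluation: 1 + x is rounded to the nearest double first,
# so many digits of x are lost before the logarithm is taken.
naive = math.log(1 + x)

# Workaround in ordinary floats: log1p is designed for small x.
stable = math.log1p(x)

# Reference value computed with 30 significant decimal digits.
mp.dps = 30
reference = float(mp.log(1 + mpf("1e-10")))

naive_rel_err = abs(naive - reference) / reference
stable_rel_err = abs(stable - reference) / reference

print(f"naive relative error: {naive_rel_err:.1e}")
print(f"log1p relative error: {stable_rel_err:.1e}")
```

In this case the ordinary floating-point workaround is already accurate to roughly machine precision, so the arbitrary precision calculation serves only as a check.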