Chapter 3. Data Preprocessing

Real-world observations are usually noisy and inconsistent, with missing data. No classification, regression, or clustering model can extract reliable information from data that has not been cleansed, filtered, or analyzed.

Data preprocessing consists of cleaning, filtering, transforming, and normalizing raw observations. It relies on statistics to correlate features or groups of features, identify trends and models, and filter out noise. The purpose of cleansing raw data is twofold:

  • Identify flaws in raw input data
  • Provide unsupervised or supervised learning with a clean and reliable dataset

You should not underestimate the power of traditional statistical analysis methods to infer and classify information from textual or unstructured data.

In this chapter, you will learn how to do the following:

  • Apply commonly used moving average techniques to detect long-term trends in a time series
  • Identify market and sector cycles using the discrete Fourier series
  • Leverage the discrete Kalman filter to extract the state of a linear dynamic system from incomplete and noisy observations
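
As a first taste of the simplest of these techniques, here is a minimal sketch of a simple moving average, which smooths a series by averaging each run of consecutive observations. It is written in Python for illustration only; the function name `simple_moving_average`, the `window` parameter, and the sample prices are assumptions made for this example and are not taken from the chapter.

```python
from collections import deque
from typing import Iterable, List


def simple_moving_average(values: Iterable[float], window: int) -> List[float]:
    """Return the mean of each full run of `window` consecutive values.

    Hypothetical helper for illustration; no output is produced until
    `window` observations have been seen.
    """
    if window <= 0:
        raise ValueError("window must be positive")

    buffer = deque()   # the observations currently inside the window
    total = 0.0        # running sum of the values in the buffer
    averages = []

    for value in values:
        buffer.append(value)
        total += value
        if len(buffer) > window:
            # Drop the oldest observation so the buffer holds `window` values.
            total -= buffer.popleft()
        if len(buffer) == window:
            averages.append(total / window)

    return averages


# Made-up sample data: five consecutive closing prices.
prices = [10.0, 11.0, 12.0, 13.0, 14.0]
smoothed = simple_moving_average(prices, window=3)
print(smoothed)  # [11.0, 12.0, 13.0]
```

A wider window filters out more short-term noise but reacts more slowly to a change in the underlying trend, which is the trade-off to keep in mind when choosing its size.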