
  • Machine Learning with Swift
  • Alexander Sosnovshchenko
  • 2021-06-24 18:54:51

Data preprocessing

The useful information in the data is usually referred to as the signal. The pieces of data that represent errors of various kinds, along with irrelevant data, are known as noise. Errors can enter the data during measurement, during information transmission, or through human error. The goal of data cleansing procedures is to increase the signal-to-noise ratio. During this stage, you will usually transform all data to one format, delete entries with missing values, and check suspicious outliers (they can be either noise or signal). It is widely believed among ML engineers that the data preprocessing stage usually consumes 90% of the time allocated for an ML project. Then, algorithm tweaking consumes another 90% of the time. This statement is only partially a joke (about 10% of it). In Chapter 13, Best Practices, we are going to discuss common problems with the data and how to fix them.
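The three cleansing steps named above (one format, no missing values, a check for suspicious outliers) can be sketched in a few lines of Swift. This is only an illustration under stated assumptions: the `RawReading` type, the sample values, and the function names are hypothetical, and the outlier check uses the modified z-score based on the median absolute deviation, which is one common choice and not a method prescribed by the text.

```swift
import Foundation

// Hypothetical raw record: values arrive as strings in mixed formats.
struct RawReading {
    let value: String?   // nil models a missing value
}

// Steps 1 and 2: bring every value to one format (trimmed, "." as the
// decimal separator) and drop entries that are missing or unparseable.
func clean(_ raw: [RawReading]) -> [Double] {
    return raw.compactMap { reading -> Double? in
        guard let text = reading.value else { return nil }
        let normalized = text
            .trimmingCharacters(in: .whitespaces)
            .replacingOccurrences(of: ",", with: ".")
        return Double(normalized)
    }
}

// Median of a non-empty array.
func median(_ xs: [Double]) -> Double {
    precondition(!xs.isEmpty, "median of an empty array is undefined")
    let sorted = xs.sorted()
    let mid = sorted.count / 2
    return sorted.count % 2 == 0 ? (sorted[mid - 1] + sorted[mid]) / 2 : sorted[mid]
}

// Step 3: flag suspicious values for review instead of deleting them,
// because an outlier can be either noise or signal.
// Assumption: modified z-score with the conventional 3.5 cutoff.
func suspiciousOutliers(in values: [Double], threshold: Double = 3.5) -> [Double] {
    guard !values.isEmpty else { return [] }
    let center = median(values)
    let mad = median(values.map { abs($0 - center) })
    guard mad > 0 else { return [] }   // no spread, nothing to flag
    return values.filter { 0.6745 * abs($0 - center) / mad > threshold }
}

// Made-up sample data for illustration only.
let raw = [
    RawReading(value: "20.1"),
    RawReading(value: " 19.8 "),
    RawReading(value: "20,3"),
    RawReading(value: nil),
    RawReading(value: "n/a"),
    RawReading(value: "20.0"),
    RawReading(value: "85.0"),
]

let cleaned = clean(raw)
let flagged = suspiciousOutliers(in: cleaned)
print("kept \(cleaned.count) of \(raw.count) entries, flagged: \(flagged)")
```

The sketch returns flagged values rather than removing them, so a person can decide whether each one is a measurement error or a genuinely interesting observation.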
