官术网_书友最值得收藏!

Binning

Sometimes, it's useful to separate feature values into several bins. For example, we may be only interested whether it rained on a particular day. Given the precipitation values, we can binarize the values, so that we get a true value if the precipitation value isn't zero and a false value otherwise. We can also use statistics to divide values into high, low, and medium bins. In marketing, we often care more about the age group, such as 18 to 24, than a specific age such as 23.

The binning process inevitably leads to loss of information. However, depending on your goals, this may not be an issue, and actually reduces the chance of overfitting. Certainly, there will be improvements in speed and reduction of memory or storage requirements and redundancy.

主站蜘蛛池模板: 凌云县| 鲜城| 加查县| 罗定市| 石嘴山市| 唐山市| 大荔县| 安庆市| 淮北市| 扶绥县| 丹巴县| 谷城县| 乐山市| 卢氏县| 东宁县| 米泉市| 荆门市| 小金县| 沁源县| 山东| 务川| 沙洋县| 中江县| 若羌县| 怀仁县| 兴化市| 思南县| 清徐县| 德安县| 会理县| 中超| 宝清县| 达州市| 阳泉市| 会宁县| 三台县| 方山县| 大关县| 吴旗县| 广饶县| 江永县|