
Inspection

Once you have acquired your data, the next step is to inspect it. The primary goal at this stage is to sanity check the data, and the best way to do that is to look for values that are either impossible or highly unlikely. For example, if the data has a unique identifier, check that each identifier really does appear only once; if the data is price-based, check that every price is positive; and whatever the data type, check the most extreme cases. Do they make sense? A good practice is to run some simple statistical tests on the data and to visualize it. The outcome of your models is only as good as the data you put in, so it is crucial to get this step right.
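
The checks described above can be sketched in a few lines of Python. This is only an illustrative sketch using the standard library, not a prescribed workflow: the function name `inspect_records`, the field names `id` and `price`, and the sample records are all hypothetical, chosen to mirror the examples in the text.

```python
from collections import Counter
from statistics import mean, median


def inspect_records(records, id_field="id", price_field="price"):
    """Run basic sanity checks on a list of dict records (hypothetical helper)."""
    ids = [r[id_field] for r in records]
    prices = [r[price_field] for r in records]

    # An identifier that appears more than once is not unique.
    duplicate_ids = sorted(k for k, n in Counter(ids).items() if n > 1)

    # Prices are expected to be strictly positive.
    non_positive = [r for r in records if r[price_field] <= 0]

    # Simple summary statistics make the extreme cases easy to spot.
    summary = {
        "count": len(prices),
        "min": min(prices),
        "max": max(prices),
        "mean": mean(prices),
        "median": median(prices),
    }
    return {
        "duplicate_ids": duplicate_ids,
        "non_positive": non_positive,
        "summary": summary,
    }


# Made-up sample data containing a duplicate id, a negative price,
# and one suspiciously large value.
sample = [
    {"id": 1, "price": 9.99},
    {"id": 2, "price": 14.50},
    {"id": 2, "price": -3.00},
    {"id": 3, "price": 12000.00},
]

report = inspect_records(sample)
print(report["duplicate_ids"])
print(report["non_positive"])
print(report["summary"])
```

A large gap between the mean and the median in the summary, as the sample data would produce, is one quick hint that an extreme value deserves a closer look before any modeling begins.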
