Inspection

Once you have acquired your data, the next step is to inspect it. The primary goal at this stage is to sanity check the data, and the best way to do that is to look for values that are either impossible or highly unlikely. For example, if the data has a unique identifier, check that each identifier really does appear only once; if the data is price-based, check that every price is positive; and whatever the data type, look at the most extreme cases. Do they make sense? A good practice is to run some simple statistical tests on the data and to visualize it. The outcome of your models is only as good as the data you put in, so it is crucial to get this step right.
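The checks described above can be sketched in a few lines of Python. This is a minimal illustration rather than a prescribed workflow: the `inspect_records` helper, the `id` and `price` field names, and the sample listings are all hypothetical, made up here to show the idea using only the standard library.

```python
from collections import Counter
from statistics import mean, median


def inspect_records(records, id_key="id", price_key="price"):
    """Run basic sanity checks on a list of dict records.

    The field names are assumptions for this sketch; adapt them to your data.
    """
    # Uniqueness: every identifier should appear exactly once.
    id_counts = Counter(r[id_key] for r in records)
    duplicate_ids = sorted(k for k, n in id_counts.items() if n > 1)

    # Impossible values: a price should always be positive.
    non_positive = [r for r in records if r[price_key] <= 0]

    # Extremes and simple summary statistics, to be judged by eye.
    prices = [r[price_key] for r in records]
    return {
        "duplicate_ids": duplicate_ids,
        "non_positive_prices": non_positive,
        "min_price": min(prices),
        "max_price": max(prices),
        "mean_price": mean(prices),
        "median_price": median(prices),
    }


# Made-up sample data containing one duplicate id, one negative price,
# and one suspiciously large value.
listings = [
    {"id": 1, "price": 1200.0},
    {"id": 2, "price": 950.0},
    {"id": 2, "price": 980.0},
    {"id": 3, "price": -50.0},
    {"id": 4, "price": 250000.0},
]

report = inspect_records(listings)
for name, value in report.items():
    print(f"{name}: {value}")
```

A large gap between the mean and the median, as in this sample, is a quick hint that an extreme value deserves a closer look before any modeling begins.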
