Outliers

An outlier is an observation that lies an unusual distance from other observations. Deciding what counts as unusual involves judgment, and it helps to work with a subject-matter expert when making that call. Exploratory data analysis involves two linked activities:

  • Examining the overall shape of the graphed data for important features
  • Examining the data for unusual observations that are far from the mass or general trend of the data

Outliers are data points that deserve a closer look. The values could be real data, accurately recorded, or they could be misrecorded or otherwise flawed. You need to determine which is the case in your situation and decide what action to take.

In this section, we consider statistical and graphical ways of summarizing the distribution of a variable and detecting unusual or extreme values. IBM SPSS Statistics provides many tools for this, found in procedures such as Frequencies, Examine, and Chart Builder. To explore these facilities, we introduce data on used Toyota Corollas and, in particular, look at the distribution of the offer prices, in euros, of sales in the Netherlands in the year 2004.
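To make the idea of flagging extreme values concrete, here is a minimal sketch in Python of one common rule of thumb, Tukey's 1.5 × IQR fences, which is also the rule behind boxplot whiskers. It is not taken from the book and is not SPSS's own implementation. The function name iqr_outliers and the prices are hypothetical and made up for illustration; they are not the Toyota Corolla data. Statistical packages compute quartiles in slightly different ways, so the fences they report may differ a little from these.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return (lower_fence, upper_fence, outliers) using Tukey's k * IQR rule.

    Hypothetical helper for illustration only.
    """
    # Quartiles by linear interpolation over the sorted data ("inclusive" method).
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    # Anything outside the fences is flagged for a closer look, not deleted.
    outliers = [v for v in values if v < lower or v > upper]
    return lower, upper, outliers


# Made-up offer prices in euros (illustrative values, not real data).
prices = [8950, 9500, 9900, 10500, 10950, 11250, 11950, 12500, 13500, 32500]

lower, upper, outliers = iqr_outliers(prices)
print(f"fences: {lower} to {upper}")
print(f"flagged: {outliers}")
```

A flagged value is only a candidate for review. Whether it is a recording error or a genuine but unusual observation still has to be decided with knowledge of the subject matter.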

The Toyota Corolla data featured in this chapter is described in Data Mining for Business Analytics: Concepts, Techniques, and Applications with XLMiner®, Third Edition, by Galit Shmueli, Peter C. Bruce, and Nitin R. Patel. Copyright 2016 John Wiley and Sons.