
Using the wizard to create a box plot chart

With a simple data set, we want to see the median (or mean) and the various percentile values across the whole data set. Quite often, though, we want to look, for a particular dimension (for example, Month), at the median and percentiles of the totals for another dimension (for example, Country). So, rather than the median of the individual values (say, Sales), which could be quite small or quite large, we want to see the median of the total value by the second dimension.

We could build this manually, but the Box Plot Wizard achieves it quickly.

Getting ready

Load the following script:

LOAD * INLINE [
    Country, Value, Month
    USA, 12, 2013-01-01
    USA, 14.5, 2013-01-01
    USA, 6.6, 2013-02-01
    USA, 4.5, 2013-02-01
    USA, 7.8, 2013-03-01
    USA, 9.4, 2013-03-01
    UK, 11.3, 2013-01-01
    UK, 10.1, 2013-01-01
    UK, 3.2, 2013-02-01
    UK, 5.6, 2013-02-01
    UK, 3.9, 2013-03-01
    UK, 6.9, 2013-03-01
];

How to do it…

Use the following steps to create a box plot using the wizard:

  1. From the Tools menu, select Box Plot Wizard….
  2. On the introductory page, click on Next.
  3. Select the following values:
  4. Click on Next.
  5. Select the following values:
  6. Click on Finish.

How it works…

The wizard takes care of creating the expressions needed for this box plot. In this case, where there is an "aggregator", that dimension is used as part of an Aggr expression.
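To make the role of the aggregator concrete, the following Python sketch reproduces the idea outside QlikView, using the data from this recipe. Values are first summed per Country within each Month, which is the kind of step an Aggr expression performs, and the median is then taken over those totals. It assumes Month is the chart dimension and Country is the aggregator; the names `rows`, `totals_by_month`, and `median_of_totals` are illustrative and are not part of QlikView.

```python
from collections import defaultdict
from statistics import median

# (Country, Value, Month) rows copied from the recipe's inline load script.
rows = [
    ("USA", 12, "2013-01-01"), ("USA", 14.5, "2013-01-01"),
    ("USA", 6.6, "2013-02-01"), ("USA", 4.5, "2013-02-01"),
    ("USA", 7.8, "2013-03-01"), ("USA", 9.4, "2013-03-01"),
    ("UK", 11.3, "2013-01-01"), ("UK", 10.1, "2013-01-01"),
    ("UK", 3.2, "2013-02-01"), ("UK", 5.6, "2013-02-01"),
    ("UK", 3.9, "2013-03-01"), ("UK", 6.9, "2013-03-01"),
]


def totals_by_month(rows):
    """Sum Value per (Month, Country), then collect the totals per Month."""
    sums = defaultdict(float)
    for country, value, month in rows:
        sums[(month, country)] += value
    grouped = defaultdict(list)
    for (month, _country), total in sums.items():
        grouped[month].append(total)
    return dict(grouped)


def median_of_totals(rows):
    """Median of the per-Country totals, one result per Month."""
    return {m: median(t) for m, t in totals_by_month(rows).items()}


for month, value in sorted(median_of_totals(rows).items()):
    print(month, round(value, 2))
```

For January, the four individual values have a median of 11.65, whereas the median of the two Country totals is about 23.95. This is the distinction between the simple data set and the aggregated one.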

There are two approaches to the box plot that can be achieved from the wizard:

  • Median mode
    • The central line is defined by the median (fiftieth percentile)
    • Top of the box is the seventy-fifth percentile
    • Bottom of the box is the twenty-fifth percentile
    • Upper whisker is the maximum value
    • Lower whisker is the minimum value
  • Average mode
    • The central line is defined by the mean value
    • Top of the box is the Mean + the Standard Deviation
    • Bottom of the box is the Mean – the Standard Deviation
    Note

    In Average mode, the Min/Max whiskers may not be valid, as they could lie inside the box!
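The two modes can be compared on a single set of totals. The sketch below, written in Python rather than QlikView, computes both sets of box values for the January totals by Country (26.5 for USA and roughly 21.4 for UK, derived from the data in this recipe). It assumes a sample standard deviation and linear interpolation for the percentiles; QlikView's own functions may interpolate differently, so treat the quartile figures as indicative only. The function names are illustrative.

```python
from statistics import mean, quantiles, stdev


def median_mode(totals):
    """Box values for Median mode: quartiles for the box, min/max whiskers."""
    q1, q2, q3 = quantiles(totals, n=4, method="inclusive")
    return {
        "lower_whisker": min(totals),
        "box_bottom": q1,
        "centre": q2,
        "box_top": q3,
        "upper_whisker": max(totals),
    }


def average_mode(totals):
    """Box values for Average mode: mean +/- one standard deviation."""
    avg = mean(totals)
    sd = stdev(totals)  # sample standard deviation (an assumption)
    return {
        "lower_whisker": min(totals),
        "box_bottom": avg - sd,
        "centre": avg,
        "box_top": avg + sd,
        "upper_whisker": max(totals),
    }


def whiskers_inside_box(box):
    """True when both whiskers fall strictly inside the box."""
    return (box["lower_whisker"] > box["box_bottom"]
            and box["upper_whisker"] < box["box_top"])


january_totals = [26.5, 21.4]  # USA and UK totals for 2013-01-01
print(median_mode(january_totals))
print(average_mode(january_totals))
print(whiskers_inside_box(average_mode(january_totals)))
```

With only two totals per month, the sample standard deviation is larger than the distance from the mean to either extreme, so in Average mode both whiskers end up inside the box, which is the situation the note warns about.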

There's more…

This recipe uses the Aggr function to see the average/median values of an aggregation. However, the box plot can be used for more than averages.

See also

  • The Creating a box plot chart for a simple data set recipe