- Data Visualization: a successful design process
- Andy Kirk
- 2021-08-05 18:15:37
Visualization as a discovery tool
One of the most compelling arguments for the value of data visualization is expressed in this quote from John W. Tukey (Exploratory Data Analysis):
"The greatest value of a picture is when it forces us to notice what we never expected to see."
Through visualization, we are seeking to portray data in ways that allow us to see it in a new light, to visually observe patterns, exceptions, and the possible stories that sit behind its raw state. This is about considering visualization as a tool for discovery.
A well-known demonstration that supports this notion was developed by the noted statistician Francis Anscombe (incidentally, brother-in-law to Tukey) in the 1970s. He constructed four sets of data, each exhibiting almost identical statistical properties, including mean, variance, and correlation. These sets are known as "Anscombe's quartet".

Sample data sets recreated from Anscombe, Francis J. (1973) Graphs in statistical analysis. American Statistician, 27, 17–21
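To make the "almost identical statistical properties" concrete, the following is a minimal sketch that computes the mean, sample variance, and Pearson correlation for each set. The values are the quartet as commonly reproduced from Anscombe's 1973 paper; the names `quartet`, `pearson`, and `summary` are illustrative and do not come from the book.

```python
from math import sqrt
from statistics import mean, variance

# Anscombe's quartet. Sets I to III share the same X values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient, computed from first principles."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Collect the summary statistics for each of the four sets.
summary = {
    name: {
        "mean_x": mean(xs),
        "var_x": variance(xs),  # sample variance (n - 1 denominator)
        "mean_y": mean(ys),
        "var_y": variance(ys),
        "corr": pearson(xs, ys),
    }
    for name, (xs, ys) in quartet.items()
}

for name, s in summary.items():
    print(
        f"{name:>3}: mean_x={s['mean_x']:.2f} var_x={s['var_x']:.2f} "
        f"mean_y={s['mean_y']:.2f} var_y={s['var_y']:.2f} corr={s['corr']:.3f}"
    )
```

Printed to two or three decimal places, the four rows are close to indistinguishable, which is exactly why the summary figures alone give no hint of how differently the sets are shaped.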
Ask yourself: what can you see in these sets of data? Do any patterns or trends jump out? Perhaps the sequence of eights in the fourth set? Otherwise, little of interest is evident.
So what can we see if we now visualize this data?

Image published under the terms of "Creative Commons Attribution-Share Alike", source: http://commons.wikimedia.org/wiki/File:Anscombe%27s_quartet_3.svg
Through this graphical display, we can immediately see the prominent patterns created by the relationships between the X and Y values across the four sets of data:
- the general tendency about a trend line in X1, Y1
- the curvature pattern of X2, Y2
- the strong linear pattern with single outlier in X3, Y3
- the vertical cluster at a single X value, with one distant outlier, in X4, Y4
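The outlier in X3, Y3 can also be found numerically, though far less directly than by looking at the plot. The sketch below fits an ordinary least-squares line to the third set and reports the point with the largest residual. The helper `fit_line` and the variable names are assumptions made for illustration, and the data values are the third set as commonly reproduced from Anscombe's paper.

```python
from statistics import mean

# Third set of Anscombe's quartet.
x3 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y3 = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    intercept = my - slope * mx
    return slope, intercept


slope, intercept = fit_line(x3, y3)

# Residual = observed Y minus the Y predicted by the fitted line.
residuals = [y - (intercept + slope * x) for x, y in zip(x3, y3)]

# The point furthest from the line is the candidate outlier.
worst = max(range(len(x3)), key=lambda i: abs(residuals[i]))
outlier = (x3[worst], y3[worst])

print(f"fitted line: y = {intercept:.2f} + {slope:.2f}x")
print(f"largest residual at point {outlier}")
```

Arriving at this point requires choosing a model, fitting it, and inspecting the residuals, whereas the chart reveals the same stray point at a glance.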
The intention and value of Anscombe's experiment were to demonstrate the importance of presenting data graphically. Describing a dataset through a selection of its key statistical properties alone is not enough: to make proper sense of data and avoid forming false conclusions, we also need to employ visualization techniques.
It is much easier to discover and confirm the presence (or even absence) of patterns, relationships, and physical characteristics (such as outliers) through a visual display, reinforcing the essence of Tukey's quote about the value of pictures.
Data visualization is about a discovery process, enabling the reader to move from just looking at data to actually seeing it. This is a subtle but important distinction.