
Exploratory Data Analysis Fundamentals

The main objective of this introductory chapter is to revise the fundamentals of Exploratory Data Analysis (EDA): what it is, the key concepts of profiling and quality assessment, the main dimensions of EDA, and its main challenges and opportunities.

Data encompasses a collection of discrete objects, numbers, words, events, facts, measurements, observations, or even descriptions of things. Such data is collected and stored from events and processes in many disciplines, including biology, economics, engineering, marketing, and others. Processing such data yields useful information, and processing that information in turn generates useful knowledge. An important question, then, is how we can generate meaningful and useful information from such data. One answer to this question is EDA. EDA is the process of examining an available dataset to discover patterns, spot anomalies, test hypotheses, and check assumptions using statistical measures. In this chapter, we are going to discuss the steps involved in performing a thorough exploratory data analysis and get our hands dirty using some open source databases.
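To make the idea of using statistical measures to spot anomalies concrete, here is a minimal sketch that uses only the Python standard library. The measurements are invented for illustration, the helper names `summarize` and `iqr_outliers` are hypothetical, and the interquartile-range rule is just one common choice for flagging unusual values, not a method prescribed by this chapter.

```python
import statistics

# Hypothetical measurements, invented purely for illustration.
values = [12.1, 11.8, 12.4, 12.0, 11.9, 30.5, 12.2, 12.3]


def summarize(data):
    """Return a few basic descriptive statistics for a list of numbers."""
    return {
        "count": len(data),
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "stdev": statistics.stdev(data),  # sample standard deviation
    }


def iqr_outliers(data, k=1.5):
    """Return the points lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(data, n=4)  # the three quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < low or x > high]


summary = summarize(values)
outliers = iqr_outliers(values)

print(summary)
print("possible anomalies:", outliers)
```

A value flagged this way is only a candidate for closer inspection. Whether it is an error or a genuine observation is something the analyst has to decide from context.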

As mentioned here and in several studies, the primary aim of EDA is to examine what data can tell us before actually going through formal modeling or hypothesis formulation. John Tukey promoted EDA to statisticians as a way to examine and explore the data and to create new hypotheses that could be used to develop new approaches to data collection and experimentation.

In this chapter, we are going to learn and revise the following topics:

Understanding data science

The significance of EDA

Making sense of data

Comparing EDA with classical and Bayesian analysis

Software tools available for EDA

Getting started with EDA
