
Exploratory Data Analysis Fundamentals

The main objective of this introductory chapter is to review the fundamentals of Exploratory Data Analysis (EDA): what it is, the key concepts of profiling and quality assessment, the main dimensions of EDA, and the main challenges and opportunities in EDA.

Data encompasses a collection of discrete objects, numbers, words, events, facts, measurements, observations, or even descriptions of things. Such data is collected and stored from the events and processes that occur in many disciplines, including biology, economics, engineering, marketing, and others. Processing such data yields useful information, and processing such information generates useful knowledge. But an important question is: how can we generate meaningful and useful information from such data? One answer to this question is EDA. EDA is the process of examining an available dataset to discover patterns, spot anomalies, test hypotheses, and check assumptions using statistical measures. In this chapter, we are going to discuss the steps involved in performing thorough exploratory data analysis and get our hands dirty using some open source databases.
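To make the idea of examining a dataset with statistical measures a little more concrete, here is a minimal sketch that uses only Python's standard library. The readings are made-up values chosen for illustration, the helper names summarize and iqr_outliers are hypothetical, and the 1.5 x IQR rule is just one common convention for flagging possible anomalies, not a method prescribed by this chapter.

```python
import statistics

# Hypothetical sample: made-up readings with one suspicious value.
readings = [21.5, 22.0, 21.8, 22.3, 21.9, 35.0, 22.1, 21.7]


def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "q1": q1,
        "q3": q3,
    }


def iqr_outliers(values, k=1.5):
    """Flag values lying more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


summary = summarize(readings)
outliers = iqr_outliers(readings)

print(summary)
print("possible anomalies:", outliers)
```

A summary like this is only a starting point: a flagged value may be a measurement error or a genuinely interesting observation, and deciding which is part of the exploratory work itself.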

As mentioned here and in several studies, the primary aim of EDA is to examine what data can tell us before actually going through formal modeling or hypothesis formulation. John Tukey promoted EDA to statisticians as a way to examine and explore data and to form new hypotheses that could guide new approaches to data collection and experimentation.

In this chapter, we are going to learn and review the following topics:

Understanding data science

The significance of EDA

Making sense of data

Comparing EDA with classical and Bayesian analysis

Software tools available for EDA

Getting started with EDA
