Chapter 2. Exploratory Data Analysis

Exploratory data analysis is a core part of data analysis. It is an approach to analyzing data that summarizes the main characteristics of a dataset. Its main objective is to check various hypotheses in order to gain a better understanding of the dataset.

Exploratory data analysis combines statistical techniques with both visual and nonvisual analysis. When your study has to be communicated to peers as well as to audiences without a data science background, it is advisable to rely on visual techniques, which make the findings easier to communicate.

Exploratory data analysis is expected to yield insights from the data, identify the important variables in the dataset (depending on the problem to be solved), reveal the outliers in the data, and provide the results of hypothesis tests. These results play an important role in deciding how to solve the business problem and, for a modeling problem, in deciding which model to use and how to apply it to the dataset for better accuracy.

In this chapter, you will learn how to perform exploratory data analysis, starting with a general view of the data, then analyzing one variable at a time, then two variables together, and finally multiple variables to better understand their interdependencies.
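To make that progression concrete, here is a minimal sketch using only the Python standard library. The records are toy values invented for illustration and are not rows from the Titanic dataset; the function names (`univariate_summary`, `survival_rate_by`, `cross_tab`) and the 1.5 × IQR outlier rule are assumptions chosen for this sketch, not the approach the chapter itself prescribes.

```python
from collections import Counter
from statistics import mean, median, quantiles

# Toy records invented for illustration; NOT rows from the Titanic dataset.
passengers = [
    {"age": 22, "fare": 7.25,   "pclass": 3, "survived": 0},
    {"age": 38, "fare": 71.28,  "pclass": 1, "survived": 1},
    {"age": 26, "fare": 8.05,   "pclass": 3, "survived": 1},
    {"age": 35, "fare": 53.1,   "pclass": 1, "survived": 1},
    {"age": 54, "fare": 512.33, "pclass": 1, "survived": 1},
    {"age": 2,  "fare": 13.0,   "pclass": 2, "survived": 0},
    {"age": 27, "fare": 26.0,   "pclass": 2, "survived": 1},
    {"age": 14, "fare": 30.0,   "pclass": 3, "survived": 0},
]


def univariate_summary(rows, column):
    """Summarize one numeric variable and flag outliers with the 1.5 * IQR rule."""
    values = [row[column] for row in rows]
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "mean": mean(values),
        "median": median(values),
        "outliers": [v for v in values if v < low or v > high],
    }


def survival_rate_by(rows, column):
    """Bivariate view: share of survivors within each level of one variable."""
    totals, survivors = Counter(), Counter()
    for row in rows:
        totals[row[column]] += 1
        survivors[row[column]] += row["survived"]
    return {level: survivors[level] / totals[level] for level in totals}


def cross_tab(rows, col_a, col_b):
    """Tabulation: count the rows for each (col_a, col_b) combination."""
    return Counter((row[col_a], row[col_b]) for row in rows)


fare_summary = univariate_summary(passengers, "fare")   # one variable
rates_by_class = survival_rate_by(passengers, "pclass")  # two variables
class_by_outcome = cross_tab(passengers, "pclass", "survived")

print(fare_summary)
print(rates_by_class)
print(class_by_outcome)
```

Each function corresponds to one stage: a single-variable summary with outlier detection, a two-variable comparison, and a tabulation that can be extended to more columns. In practice a data analysis library would replace these helpers, but the questions asked of the data stay the same.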

The topics that will be covered in this chapter are as follows:

  • Titanic dataset
  • Descriptive statistics
  • Inferential statistics
  • Univariate analysis
  • Bivariate analysis
  • Multivariate analysis (scatter plot with segments, heatmap, and tabulation)