- Hands-On Exploratory Data Analysis with Python
- Suresh Kumar Mukhiya, Usman Ahmed
- 2021-06-24 16:44:47
Steps in EDA
Having understood what EDA is and why it matters, let's look at the steps involved in data analysis. There are four of them. Let's go through each one briefly:
Problem definition: Before trying to extract useful insight from the data, it is essential to define the business problem to be solved. The problem definition drives the execution of the data analysis plan. The main tasks involved are defining the main objective of the analysis, defining the main deliverables, outlining the main roles and responsibilities, obtaining the current status of the data, defining the timetable, and performing a cost/benefit analysis. An execution plan can then be created from this problem definition.
Data preparation: This step involves methods for preparing the dataset before actual analysis. In this step, we define the sources of data, define data schemas and tables, understand the main characteristics of the data, clean the dataset, delete non-relevant datasets, transform the data, and divide the data into required chunks for analysis.
Data analysis: This is one of the most crucial steps, and it deals with descriptive statistics and analysis of the data. The main tasks are summarizing the data, finding hidden correlations and relationships within the data, developing predictive models, evaluating the models, and calculating their accuracy. Techniques used for data summarization include summary tables, graphs, descriptive statistics, inferential statistics, correlation statistics, searching, grouping, and mathematical models.
Development and representation of the results: This step involves presenting the results of the analysis to the target audience in the form of graphs, summary tables, maps, and diagrams. This is also an essential step, because the results obtained from the dataset should be interpretable by the business stakeholders, which is one of the major goals of EDA. Common graphical analysis techniques include scatter plots, character plots, histograms, box plots, residual plots, mean plots, and others. We will explore several types of graphical representation in Chapter 2, Visual Aids for EDA.
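To make the last three steps concrete, here is a minimal sketch that uses only the Python standard library. The dataset (hours studied versus exam score), the column names, and the helper functions prepare, summarize, pearson, and report are all hypothetical and chosen purely for illustration. It covers one cleaning task (dropping rows with missing values), two analysis tasks (descriptive statistics and a correlation coefficient), and a plain-text summary table as the representation of the results. A real project would likely rely on a data analysis library and on plots instead.

```python
import statistics

# Hypothetical raw records: (hours_studied, exam_score).
# None marks a missing value. The numbers are made up for illustration.
raw_records = [
    (1.0, 52.0),
    (2.0, 58.0),
    (None, 61.0),
    (3.0, 65.0),
    (4.0, 71.0),
    (5.0, None),
    (6.0, 84.0),
]


def prepare(records):
    """Data preparation: drop every row that contains a missing value."""
    return [row for row in records if None not in row]


def summarize(values):
    """Data analysis: basic descriptive statistics for one column."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


def pearson(xs, ys):
    """Data analysis: Pearson correlation coefficient of two columns."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def report(name, summary):
    """Representation of results: one row of a plain-text summary table."""
    return (
        f"{name:<6} n={summary['count']} "
        f"mean={summary['mean']:.2f} "
        f"median={summary['median']:.2f} "
        f"stdev={summary['stdev']:.2f}"
    )


clean = prepare(raw_records)
hours = [h for h, _ in clean]
scores = [s for _, s in clean]

hours_summary = summarize(hours)
scores_summary = summarize(scores)
r = pearson(hours, scores)

print(report("hours", hours_summary))
print(report("score", scores_summary))
print(f"Pearson r = {r:.3f}")
```

Keeping each step in its own function mirrors the separation described above: the cleaning rule can change without touching the statistics, and the report can be swapped for a chart without touching either.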