
The significance of EDA

Fields as varied as science, economics, engineering, and marketing accumulate data and store it primarily in electronic databases. Sound, well-informed decisions should be based on that data. It is practically impossible to make sense of datasets containing more than a handful of data points without the help of computer programs. To be confident in the insights the collected data provides, and to make further decisions, we perform data mining, which involves several distinct analysis processes. Exploratory data analysis (EDA) is key, and it is usually the first exercise in data mining. It allows us to visualize data in order to understand it and to form hypotheses for further analysis. Exploratory analysis centers on creating a synopsis of the data, or insights, for the next steps in a data mining project.

EDA reveals the ground truth about the content of the data without making any underlying assumptions. This is why data scientists use this process to understand what types of models and hypotheses can be created. Key components of exploratory data analysis include summarizing data, statistical analysis, and visualization of data. Python provides capable tools for exploratory analysis: pandas for summarizing; scipy, among others, for statistical analysis; and matplotlib and plotly for visualization.
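To make the three components concrete, here is a minimal sketch that uses pandas to summarize, scipy for one statistical measure, and matplotlib to visualize. The dataset is synthetic and invented purely for illustration; the column names `height_cm` and `weight_kg` and the output filename `eda_overview.png` are assumptions, not part of any real dataset.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic data, made up for illustration only.
rng = np.random.default_rng(seed=0)
df = pd.DataFrame({
    "height_cm": rng.normal(loc=170, scale=10, size=200),
    "weight_kg": rng.normal(loc=70, scale=12, size=200),
})

# 1. Summarize: count, mean, std, min, quartiles, and max for each column.
summary = df.describe()
print(summary)

# 2. Statistical analysis: Pearson correlation between the two columns.
r, p_value = stats.pearsonr(df["height_cm"], df["weight_kg"])
print(f"Pearson r = {r:.3f}, p-value = {p_value:.3f}")

# 3. Visualize: a histogram of one column and a scatter plot of both.
fig, (ax_hist, ax_scatter) = plt.subplots(1, 2, figsize=(10, 4))
ax_hist.hist(df["height_cm"], bins=20)
ax_hist.set_xlabel("height_cm")
ax_hist.set_ylabel("count")
ax_scatter.scatter(df["height_cm"], df["weight_kg"], s=10)
ax_scatter.set_xlabel("height_cm")
ax_scatter.set_ylabel("weight_kg")
fig.tight_layout()
fig.savefig("eda_overview.png")  # hypothetical output filename
```

With real data, the `DataFrame` would typically come from a file or a database query rather than a random generator, and the choice of statistic and plot would depend on the types of the columns involved.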

That makes sense, right? Of course it does. That is one of the reasons why you are going through this book. Now that we understand the significance of EDA, let's look at the most generic steps involved in EDA in the next section.
