The benefits of EDA across vertical markets

Every organization today produces and relies on a lot of data in its everyday processes. Before making assumptions and decisions based on this data, an organization needs to be able to understand it. EDA enables data analysts and data scientists to bring this information to the right people. It is the most important step on which a data-driven organization should focus its energy and resources.

Having practical tools at hand for carrying out EDA helps data analysts and data scientists produce reproducible and well-informed data analysis results. R is one of the most popular data analysis environments, so it makes sense to equip your data analysis teams with powerful R techniques to make the most of their EDA skills.

At the time of writing this book, there are more than 13,000 R packages available according to CRAN. You can get R packages for all kinds of tasks and domains. We will concentrate on a particular set of R packages that the R community considers the best for EDA. Some of the packages that we are going to cover may not be directly related to EDA, but they are relevant for other stages of dealing with the data, as indicated by the following diagram:

We will introduce these packages briefly in this chapter and go into more detail as the book progresses. The different stages are as follows:

  • Pre Modeling Stage: This stage involves preparing the data frame through Data Visualization, Data Transformation, Missing Value Imputations, Outlier Detection, Feature Selection, and Dimension Reduction.
  • Modeling Stage: This is the intermediate stage, which involves Continuous Regression, Ordinal Regression, Classification, Clustering, and Time Series with Survival.
  • Post Modeling Stage: This is the final stage, where output interpretation takes priority. It includes the implementation of various algorithms such as clustering, classification, and regression.
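
To make the Pre Modeling Stage more concrete, here is a minimal sketch of a few of the tasks listed above (inspection, missing value imputation, outlier detection, and transformation). It is only an illustration under some assumptions: it uses base R and the built-in airquality data set rather than the packages covered later in the book, and the median imputation and 1.5 * IQR rule are common defaults chosen for this example, not methods prescribed by the text.

```r
# Pre-modeling sketch using only base R and the built-in airquality data set.
# The data set and the specific techniques are assumptions for illustration.
df <- airquality

# 1. Inspect the data: structure and per-column summaries (including NA counts).
str(df)
summary(df)

# 2. Missing value imputation: replace NAs in Ozone with the column median.
ozone_median <- median(df$Ozone, na.rm = TRUE)
df$Ozone[is.na(df$Ozone)] <- ozone_median

# 3. Outlier detection: flag Wind values beyond 1.5 * IQR from the quartiles.
q <- quantile(df$Wind, c(0.25, 0.75))
fence <- 1.5 * IQR(df$Wind)
wind_outliers <- df$Wind < q[1] - fence | df$Wind > q[2] + fence

# 4. Data transformation: center and scale selected numeric columns.
scaled <- scale(df[, c("Ozone", "Wind", "Temp")])
```

Here wind_outliers is a logical vector with one entry per row, so df[wind_outliers, ] lists the flagged observations for review before any decision is made to drop or keep them.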