Using Spark SQL for Data Exploration

This chapter introduces Spark SQL as a tool for exploratory data analysis. We cover preliminary techniques for computing basic statistics, identifying outliers, and visualizing, sampling, and pivoting data. A series of hands-on exercises will help you use Spark SQL, along with tools such as Apache Zeppelin, to develop an intuition about your data.

We shall look at the following topics:

  • What is Exploratory Data Analysis (EDA)?
  • Why is EDA important?
  • Using Spark SQL for basic data analysis
  • Visualizing data with Apache Zeppelin
  • Sampling data with Spark SQL APIs
  • Using Spark SQL for creating pivot tables