Title: Learning Spark SQL
Author: Aurobindo Sarkar
Chapter word count: 113
Updated: 2021-07-02 18:23:45

Using Spark SQL for Data Exploration

In this chapter, we introduce Spark SQL as a tool for exploratory data analysis. We cover preliminary techniques for computing basic statistics, identifying outliers, and visualizing, sampling, and pivoting data. A series of hands-on exercises will enable you to use Spark SQL, along with tools such as Apache Zeppelin, to develop an intuition about your data.

In this chapter, we shall look at the following topics:

What is Exploratory Data Analysis (EDA)?
Why is EDA important?
Using Spark SQL for basic data analysis
Visualizing data with Apache Zeppelin
Sampling data with Spark SQL APIs
Using Spark SQL for creating pivot tables
