Title: Frank Kane's Taming Big Data with Apache Spark and Python
Author: Frank Kane
Updated: 2021-07-02 21:12:13

What this book covers

Chapter 1, Getting Started with Spark, covers basic installation instructions for Spark and its related software. This chapter also walks through a simple analysis of real movie ratings data provided by different sets of people.

Chapter 2, Spark Basics and Simple Examples, provides a brief overview of what Spark is, who uses it, how it helps analyze big data, and why it is so popular.

Chapter 3, Advanced Examples of Spark Programs, illustrates some more advanced and complex examples of Spark programs.

Chapter 4, Running Spark on a Cluster, talks about Spark Core and the things you can do with it, such as running Spark on a cluster in the cloud and analyzing a real cluster in the cloud using Spark.

Chapter 5, SparkSQL, DataFrames, and DataSets, introduces SparkSQL, an important part of Spark, and explains how to use it to work with structured data formats.

Chapter 6, Other Spark Technologies and Libraries, talks about MLlib (Machine Learning library), which is very helpful if you want to work on data mining or machine learning-related jobs with Spark. This chapter also covers Spark Streaming and GraphX, two technologies built on top of Spark.

Chapter 7, Where to Go From Here? - Learning More About Spark and Data Science, recommends some books on Spark for readers who want to learn more about the topic.
