- Apache Spark 2.x for Java Developers
- Sourav Gulati, Sumit Kumar
- 320 words
- 2021-07-02 19:01:50
What this book covers
Chapter 1, Introduction to Spark, covers the history of big data, its dimensions, and basic concepts of Hadoop and Spark.
Chapter 2, Revisiting Java, refreshes core Java concepts and focuses on the newer Java 8 features that are leveraged when developing Spark applications.
Chapter 3, Let Us Spark, provides step-by-step instructions so that the reader becomes familiar with installing Apache Spark in standalone mode along with its dependencies.
Chapter 4, Understanding the Spark Programming Model, works through the word count problem in Apache Spark using Java while setting up an IDE along the way.
Chapter 5, Working with Data and Storage, teaches you how to read data into Spark from different storage systems and how to store it back to them.
Chapter 6, Spark on Cluster, discusses the cluster setup process and covers in detail some popular cluster managers available with Spark. After this chapter, you will be able to execute Spark jobs effectively in distributed mode.
Chapter 7, Spark Programming Model – Advanced, covers partitioning concepts in RDDs along with advanced transformations and actions in Spark.
Chapter 8, Working with Spark SQL, discusses Spark SQL and its related concepts such as DataFrame, Dataset, and UDF. We will also discuss SQLContext and the newly introduced SparkSession.
Chapter 9, Near-Real-Time Processing with Spark Streaming, covers the internals of Spark Streaming, reading streams of data in Spark from various data sources with examples, and Structured Streaming, a newer extension of stream processing in Spark.
Chapter 10, Machine Learning Analytics with Spark MLlib, introduces the concepts of machine learning and then moves on to their implementation using the Apache Spark MLlib libraries. We also discuss some real-world problems using Spark MLlib.
Chapter 11, Learning Spark GraphX, looks into another module of Spark, GraphX; we will discover the types of GraphX RDDs and the various operations associated with them. We will also discuss use cases for GraphX.