
  • PySpark Cookbook
  • Denny Lee Tomasz Drabas
  • 160 words
  • 2021-06-18 19:06:24

What this book covers

Chapter 1, Installing and Configuring Spark, shows how to install and configure Spark as a local instance, as a multi-node cluster, or in a virtual environment.

Chapter 2, Abstracting Data with RDDs, covers how to work with Apache Spark Resilient Distributed Datasets (RDDs).

Chapter 3, Abstracting Data with DataFrames, explores DataFrames, the current fundamental data structure.

Chapter 4, Preparing Data for Modeling, covers how to clean up your data and prepare it for modeling.

Chapter 5, Machine Learning with MLlib, shows how to build machine learning models with PySpark's MLlib module.

Chapter 6, Machine Learning with the ML Module, moves on to the currently supported machine learning module of PySpark—the ML module.

Chapter 7, Structured Streaming with PySpark, covers how to work with Apache Spark structured streaming within PySpark.

Chapter 8, GraphFrames – Graph Theory with PySpark, shows how to work with GraphFrames for Apache Spark.
