Table of Contents (93 chapters)
- Cover
- Copyright Information
- Credits
- About the Author
- About the Reviewers
- www.PacktPub.com
- Preface
- Chapter 1. Getting Started with Apache Spark
- Introduction
- Installing Spark from binaries
- Building the Spark source code with Maven
- Launching Spark on Amazon EC2
- Deploying on a cluster in standalone mode
- Deploying on a cluster with Mesos
- Deploying on a cluster with YARN
- Using Tachyon as an off-heap storage layer
- Chapter 2. Developing Applications with Spark
- Introduction
- Exploring the Spark shell
- Developing Spark applications in Eclipse with Maven
- Developing Spark applications in Eclipse with SBT
- Developing a Spark application in IntelliJ IDEA with Maven
- Developing a Spark application in IntelliJ IDEA with SBT
- Chapter 3. External Data Sources
- Introduction
- Loading data from the local filesystem
- Loading data from HDFS
- Loading data from HDFS using a custom InputFormat
- Loading data from Amazon S3
- Loading data from Apache Cassandra
- Loading data from relational databases
- Chapter 4. Spark SQL
- Introduction
- Understanding the Catalyst optimizer
- Creating HiveContext
- Inferring schema using case classes
- Programmatically specifying the schema
- Loading and saving data using the Parquet format
- Loading and saving data using the JSON format
- Loading and saving data from relational databases
- Loading and saving data from an arbitrary source
- Chapter 5. Spark Streaming
- Introduction
- Word count using Streaming
- Streaming Twitter data
- Streaming using Kafka
- Chapter 6. Getting Started with Machine Learning Using MLlib
- Introduction
- Creating vectors
- Creating a labeled point
- Creating matrices
- Calculating summary statistics
- Calculating correlation
- Doing hypothesis testing
- Creating machine learning pipelines using ML
- Chapter 7. Supervised Learning with MLlib – Regression
- Introduction
- Using linear regression
- Understanding cost function
- Doing linear regression with lasso
- Doing ridge regression
- Chapter 8. Supervised Learning with MLlib – Classification
- Introduction
- Doing classification using logistic regression
- Doing binary classification using SVM
- Doing classification using decision trees
- Doing classification using Random Forests
- Doing classification using Gradient Boosted Trees
- Doing classification with Naïve Bayes
- Chapter 9. Unsupervised Learning with MLlib
- Introduction
- Clustering using k-means
- Dimensionality reduction with principal component analysis
- Dimensionality reduction with singular value decomposition
- Chapter 10. Recommender Systems
- Introduction
- Collaborative filtering using explicit feedback
- Collaborative filtering using implicit feedback
- Chapter 11. Graph Processing Using GraphX
- Introduction
- Fundamental operations on graphs
- Using PageRank
- Finding connected components
- Performing neighborhood aggregation
- Chapter 12. Optimizations and Performance Tuning
- Introduction
- Optimizing memory
- Using compression to improve performance
- Using serialization to improve performance
- Optimizing garbage collection
- Optimizing the level of parallelism
- Understanding the future of optimization – project Tungsten
- Index
Updated: 2021-07-16 13:44:17