- Fast Data Processing with Spark 2 (Third Edition)
- Krishna Sankar
- 155 words
- 2021-08-20 10:27:05
Preface
Apache Spark has captured the imagination of analytics and big data developers, and rightfully so. In a nutshell, Spark enables distributed computing at scale, in the lab or in production. Until now, the collect-store-transform pipeline was distinct from the data science Reason-Model pipeline, which was in turn distinct from the deployment of analytics and machine learning models. Now, with Spark and technologies such as Kafka, we can seamlessly span the data management and data science pipelines. Moreover, we can now build data science models on larger datasets rather than on sampled data alone. And whatever models we build can be deployed into production (with added work from engineering on the “ilities”, of course). It is our hope that this book will help data engineers become familiar with the fundamentals of the Spark platform and give them hands-on experience with some of its advanced capabilities.
- scikit-learn Cookbook
- 編程的修煉
- Building a Home Security System with Raspberry Pi
- Vue.js 2 and Bootstrap 4 Web Development
- INSTANT MinGW Starter
- Backbone.js Blueprints
- Learning DHTMLX Suite UI
- Python爬蟲、數據分析與可視化:工具詳解與案例實戰
- Hands-On Nuxt.js Web Development
- Elasticsearch Essentials
- Mastering Adobe Captivate 7
- 面向對象程序設計及C++(第3版)
- jQuery Essentials
- Node.js Web Development
- Selenium Essentials