- Big Data Analytics
- Venkat Ankam
- 2021-08-20 10:32:24
Summary
Apache Hadoop provides a reliable and scalable framework (HDFS) for Big Data storage and a powerful cluster resource management framework (YARN) for running and managing multiple Big Data applications. Apache Spark provides in-memory performance for Big Data processing, along with libraries and APIs for interactive exploratory analytics, real-time analytics, machine learning, and graph analytics. While MapReduce (MR) was the primary processing engine on top of Hadoop, it had multiple drawbacks, such as poor performance and inflexibility in application design. Apache Spark is a replacement for MR. MR-based tools such as Hive, Pig, Mahout, and Crunch have already started offering Apache Spark as an additional execution engine alongside MR.
Big Data projects are now being implemented in businesses of every size, from large Fortune 500 companies to small start-ups. Organizations gain an edge if they can go from raw data to decisions quickly, with easy-to-use tools for developing applications and exploring data. Apache Spark brings this speed and sophistication to Hadoop clusters.
In the next chapter, let's dive deep into Spark.