- Hadoop Blueprints
- Anurag Shrivastava, Tanmay Deshpande
- 2021-07-14 09:51:36
Chapter 1. Hadoop and Big Data
Hadoop has become the heart of the big data ecosystem and is gradually evolving into a full-fledged data operating system. There is no standard definition of big data, but the term generally refers to data with one or more of the following characteristics: huge volume, typically several petabytes in size; high velocity, such as several thousand clickstream events arriving per second; or variety combined with volume, such as images, click data, emails, blogs, tweets, and Facebook posts. A big data processing system has to deal with any combination of volume, velocity, and variety. These are known as the 3Vs of big data and are often used to characterize a big data system. Some analysts and companies, most notably IBM, have added a fourth V, veracity, to signify the correctness and accuracy problems associated with big datasets, which exist at much lower levels in enterprise datasets.
In this chapter, we will introduce you to the explosive growth of data around the turn of the century and the technological evolution that led to the development of Hadoop. We will cover the following topics:
- The technical evolution of Hadoop
- The rise of enterprise Hadoop
- Hadoop design and tools
- Developing a program to run on Hadoop
- An overview of solution blueprints
- Hadoop architectural patterns