Title: Hadoop Blueprints
Authors: Anurag Shrivastava, Tanmay Deshpande
Last updated: 2021-07-14 09:51:36

Chapter 1. Hadoop and Big Data

Hadoop has become the heart of the big data ecosystem and is gradually evolving into a full-fledged data operating system. There is no standard definition of big data, but the term generally refers to one or more of the following: a huge volume of data, typically several petabytes in size; data arriving at high velocity, such as several thousand clickstream events per second; or data that combines variety with volume, such as images, click data, emails, blogs, tweets, and Facebook posts. A big data processing system has to deal with any combination of volume, velocity, and variety. These are known as the 3Vs of big data and are often used to characterize a big data system. Some analysts and companies, most notably IBM, have added a fourth V, veracity, to signify the correctness and accuracy problems associated with big datasets, problems that exist at much lower levels in enterprise datasets.

In this chapter, we will introduce you to the explosive growth of data around the turn of the century and the technological evolution that led to the development of Hadoop. We will cover the following topics:

The technical evolution of Hadoop
The rise of enterprise Hadoop
Hadoop design and tools
Developing a program to run on Hadoop
An overview of solution blueprints
Hadoop architectural patterns
