官术网_书友最值得收藏!

Big Data With Hadoop

Hadoop has become the de facto standard in the world of big data, especially over the past three to four years. Hadoop started as a subproject of Apache Nutch in 2006 and introduced two key features related to distributed filesystems and distributed computing, also known as MapReduce, that caught on very rapidly among the open source community. Today, there are thousands of new products that have been developed leveraging the core features of Hadoop, and it has evolved into a vast ecosystem consisting of more than 150 related major products. Arguably, Hadoop was one of the primary catalysts that started the big data and analytics industry.

In this chapter, we will discuss the background and core concepts of Hadoop, the components of the Hadoop platform, and delve deeper into the major products in the Hadoop ecosystem. We will learn about the core concepts of distributed filesystems and distributed processing and optimizations to improve the performance of Hadoop deployments. We'll conclude with real-world hands-on exercises using the Cloudera Distribution of Hadoop (CDH). The topics we will cover are:

  • The basics of Hadoop
  • The core components of Hadoop
  • Hadoop 1 and Hadoop 2
  • The Hadoop Distributed File System
  • Distributed computing principles with MapReduce
  • The Hadoop ecosystem
  • Overview of the Hadoop ecosystem
  • Hive, HBase, and more
  • Hadoop Enterprise deployments
  • In-house deployments
  • Cloud deployments
  • Hands-on with Cloudera Hadoop
  • Using HDFS
  • Using Hive
  • MapReduce with WordCount
主站蜘蛛池模板: 江油市| 崇仁县| 内江市| 天全县| 新野县| 新龙县| 册亨县| 维西| 荔波县| 抚顺县| 海丰县| 东丽区| 丽水市| 喀喇沁旗| 荃湾区| 黄山市| 陈巴尔虎旗| 望城县| 天镇县| 红原县| 龙海市| 阳高县| 葫芦岛市| 高陵县| 郓城县| 甘泉县| 黑山县| 霍城县| 翁源县| 沁水县| 丰县| 方正县| 河津市| 吐鲁番市| 宿迁市| 石景山区| 西乌珠穆沁旗| 资源县| 贵港市| 洪湖市| 诏安县|