Planning and Setting Up Hadoop Clusters

In the last chapter, we looked at big data problems and the history of Hadoop, along with an overview of the Hadoop architecture and its commercial offerings. This chapter focuses on hands-on, practical knowledge of how to set up Hadoop in different configurations. Apache Hadoop can be set up in the following three configurations:

  • Developer mode: Developer mode runs programs in a standalone manner. It does not require any Hadoop daemons, and JAR files can be run directly. This mode is useful when developers want to debug their MapReduce code.
  • Pseudo cluster (single node Hadoop): A pseudo cluster is a single-node cluster with capabilities similar to those of a standard cluster; it is used to develop and test programs before they are deployed on a production cluster. A pseudo cluster gives each developer an independent environment for coding and testing.
  • Cluster mode: This is the real Hadoop cluster, in which multiple Hadoop nodes are set up across your production environment. It is the mode intended for production big data workloads.
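
One visible difference between the three configurations is the default filesystem that Hadoop is pointed at. The following is a minimal sketch, not a definitive setup: it generates a bare core-site.xml for each mode using the fs.defaultFS property. The mode names, the helper core_site_xml, the host name namenode.example.com, and port 9000 are illustrative assumptions; a real cluster needs many more properties than this.

```python
import xml.etree.ElementTree as ET

# Assumed mapping from an illustrative mode name to a typical fs.defaultFS value.
DEFAULT_FS = {
    "standalone": "file:///",             # developer mode: local filesystem, no daemons
    "pseudo": "hdfs://localhost:9000",    # pseudo cluster: all daemons on one machine
    "cluster": "hdfs://{namenode}:9000",  # cluster mode: NameNode on its own host
}


def core_site_xml(mode, namenode="namenode.example.com"):
    """Return a minimal core-site.xml document (as a string) for the given mode.

    The namenode argument is a hypothetical host name and is only used
    when mode is "cluster".
    """
    if mode not in DEFAULT_FS:
        raise ValueError(f"unknown mode: {mode}")
    root = ET.Element("configuration")
    prop = ET.SubElement(root, "property")
    ET.SubElement(prop, "name").text = "fs.defaultFS"
    ET.SubElement(prop, "value").text = DEFAULT_FS[mode].format(namenode=namenode)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    for mode in DEFAULT_FS:
        print(mode, "->", core_site_xml(mode))
```

In developer mode the value stays on the local filesystem, which is why no daemons are needed; the pseudo cluster and cluster modes differ mainly in whether the HDFS address refers to the local machine or to a separate host.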

This chapter will focus on setting up a new Hadoop cluster. The standard cluster is the one used in production and staging environments. In many cases, it can also be scaled down and used for development, to verify that programs run across nodes, handle failover, and so on. In this chapter, we will cover the following topics:

  • Prerequisites for Hadoop
  • Running Hadoop in development mode
  • Setting up a pseudo Hadoop cluster
  • Sizing the cluster
  • Setting up Hadoop in cluster mode
  • Diagnosing the Hadoop cluster