Preface

Apache Hadoop is an open source distributed computing technology that helps users process large volumes of data with relative ease and draw insights from that data. Cloudera, with its open source distribution of Hadoop, has made Big Data analytics accessible to anyone interested.

This book fully prepares you to be a Hadoop administrator, with special emphasis on Cloudera. It provides step-by-step instructions on setting up and managing a robust Hadoop cluster running Cloudera's Distribution Including Apache Hadoop (CDH).

This book starts out by giving you a brief introduction to Apache Hadoop and Cloudera. You will then move on to learn about all the tools and techniques needed to set up and manage a production-standard Hadoop cluster using CDH and Cloudera Manager.

In this book, you will learn the Hadoop architecture by understanding the different features of HDFS and walking through the entire flow of a MapReduce process. With this understanding, you will start exploring the different applications packaged into CDH and will follow a step-by-step guide to set up HDFS High Availability (HA) and HDFS Federation.

You will learn to use Cloudera Manager, Cloudera's cluster management application. Using Cloudera Manager, you will walk through the steps to configure security using Kerberos, learn about events and alerts, and configure backups.
