Title: Apache Hadoop 3 Quick Start Guide
Author: Hrishikesh Vijay Karambelkar
Last updated: 2021-06-10 19:18:44

Planning and sizing clusters

Once you start implementing Hadoop clusters for real problems, you'll have to deal with sizing. Sizing is not only about the capacity of the cluster itself; the SLAs associated with the Hadoop runtime need to be considered as well. Based on its workload, a cluster can be categorized as follows:

Lightweight: This category is intended for low computation and low storage requirements, and is best suited to well-defined datasets that are not expected to grow
Balanced: A balanced cluster can have storage and computation requirements that grow over time
Storage-centric: This category focuses on storing data rather than on computation; it is mostly used for archival purposes with minimal processing
Computational-centric: This category is intended for heavy computation that requires CPU- or GPU-intensive work, such as analytics, prediction, and data mining
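
The categories above can be read as a simple decision over three questions: is the workload storage-heavy, is it compute-heavy, and is the data expected to grow? The following Python sketch illustrates one possible way to encode that reading. The names `WorkloadProfile` and `categorize`, the three boolean fields, and the order of the rules are all assumptions made for illustration; they are not part of Hadoop or of any sizing tool.

```python
from dataclasses import dataclass
from enum import Enum


class ClusterCategory(Enum):
    LIGHTWEIGHT = "lightweight"
    BALANCED = "balanced"
    STORAGE_CENTRIC = "storage-centric"
    COMPUTATIONAL_CENTRIC = "computational-centric"


@dataclass
class WorkloadProfile:
    # Hypothetical, coarse-grained inputs; a real assessment would use measured figures.
    storage_heavy: bool  # large volumes of data to retain, for example archives
    compute_heavy: bool  # CPU- or GPU-intensive jobs such as analytics or data mining
    data_grows: bool     # the dataset is expected to grow over time


def categorize(profile: WorkloadProfile) -> ClusterCategory:
    """Map a workload profile to one of the four categories (illustrative rules only)."""
    if profile.compute_heavy and not profile.storage_heavy:
        return ClusterCategory.COMPUTATIONAL_CENTRIC
    if profile.storage_heavy and not profile.compute_heavy:
        return ClusterCategory.STORAGE_CENTRIC
    if not profile.compute_heavy and not profile.storage_heavy and not profile.data_grows:
        return ClusterCategory.LIGHTWEIGHT
    # Everything else (both demands present, or modest demands that grow) is treated as balanced.
    return ClusterCategory.BALANCED


if __name__ == "__main__":
    archive = WorkloadProfile(storage_heavy=True, compute_heavy=False, data_grows=True)
    print(categorize(archive).value)
```

In practice the boundaries between categories are a judgment call, so a sketch like this is only a starting point for discussion rather than a sizing rule.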

Before we tackle the sizing problem of a Hadoop cluster, however, we need to understand the following topics.
