- Cloudera Administration Handbook
- Rohit Menon
Components of Apache Hadoop
Apache Hadoop is composed of two core components. They are:
- HDFS: HDFS is the storage component of Apache Hadoop, designed and developed to handle large files efficiently. It is a distributed filesystem built to run on a cluster: it stores large files by splitting them into blocks and distributing those blocks redundantly across multiple nodes. Users of HDFS need not worry about the underlying networking, as HDFS takes care of it. HDFS is written in Java and runs in user space.
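The block splitting and replication described above can be sketched in plain Python. This is an illustrative model only, not the HDFS API; the node names and round-robin placement are assumptions (real HDFS placement is rack-aware), while the 128 MB block size and replication factor of 3 are the HDFS defaults.

```python
# Illustrative sketch (NOT the HDFS API): how a large file is split
# into fixed-size blocks and how each block is replicated across nodes.
BLOCK_SIZE = 128 * 1024 * 1024  # HDFS default block size, in bytes
REPLICATION = 3                 # HDFS default replication factor

def plan_blocks(file_size, nodes):
    """Return (block_index, [nodes holding a replica]) for each block."""
    num_blocks = -(-file_size // BLOCK_SIZE)  # ceiling division
    placement = []
    for i in range(num_blocks):
        # Simple round-robin placement; real HDFS is rack-aware.
        replicas = [nodes[(i + r) % len(nodes)] for r in range(REPLICATION)]
        placement.append((i, replicas))
    return placement

# A 300 MB file needs 3 blocks: two full 128 MB blocks plus a 44 MB tail.
plan = plan_blocks(300 * 1024 * 1024, ["node1", "node2", "node3", "node4"])
```

The point is that a client sees one logical file, while the cluster sees a set of independently replicated blocks, which is what makes node failure survivable.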
- MapReduce: MapReduce is a programming model built on ideas from functional programming and distributed computing. In MapReduce, a task is broken down into two parts: map and reduce. All data in MapReduce flows in the form of key-value pairs, <key, value>. Mappers emit key-value pairs, and reducers receive them, work on them, and produce the final result. This model was built specifically to query and process the large volumes of data stored in HDFS.
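The map/reduce flow above can be sketched in plain Python with the classic word-count example. This is a sketch of the model only, not the Hadoop API: the function names `mapper`, `reducer`, and `run_mapreduce` are hypothetical, and the grouping step stands in for the shuffle-and-sort phase that Hadoop performs between the two stages.

```python
from collections import defaultdict

# Illustrative sketch of the MapReduce model (plain Python, not the
# Hadoop API): mappers emit <key, value> pairs, the framework groups
# them by key, and reducers combine each group into a final result.

def mapper(line):
    for word in line.split():
        yield (word, 1)            # emit <word, 1> for each occurrence

def reducer(word, counts):
    return (word, sum(counts))     # combine all values seen for one key

def run_mapreduce(lines):
    groups = defaultdict(list)
    for line in lines:             # map phase
        for key, value in mapper(line):
            groups[key].append(value)  # stand-in for shuffle-and-sort
    return dict(reducer(k, v) for k, v in groups.items())  # reduce phase

result = run_mapreduce(["big data big cluster", "big data"])
# result == {"big": 3, "data": 2, "cluster": 1}
```

In Hadoop itself, the map and reduce phases run in parallel across the cluster, with each mapper reading a local HDFS block, which is how the model scales to very large inputs.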
We will be going through HDFS and MapReduce in depth in the next chapter.