Summary

We have covered a lot of ground in this chapter and we now have the foundation to explore MapReduce in more detail. Specifically, we learned how key/value pairs are a broadly applicable data model that is well suited to MapReduce processing. We also learned how to write mapper and reducer implementations using version 0.20 and later of the Java API.

We then moved on and saw how a MapReduce job is processed and how the map and reduce methods are tied together by significant coordination and task-scheduling machinery. We also saw how certain MapReduce jobs require specialization in the form of a custom partitioner or combiner.
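
As a reminder of how these pieces fit together, the following is a minimal sketch in plain Java that imitates the flow within a single process: a map step emits key/value pairs, a partition step decides which reduce task receives each key, and a reduce step collapses the values for each key. It does not use the Hadoop API; the class name MiniMapReduce and its methods are hypothetical names chosen for illustration, and the partition formula is assumed to mirror hash-based partitioning. Real jobs add the scheduling, shuffling, and fault-handling machinery that this sketch leaves out.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical single-process imitation of a word-count job; not Hadoop code.
public class MiniMapReduce {

    // Map step: emit a (word, 1) pair for every whitespace-separated token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                out.add(new SimpleEntry<>(token, 1));
            }
        }
        return out;
    }

    // Partition step: pick a reduce task for a key from its hash code.
    // Masking with Integer.MAX_VALUE keeps the result non-negative.
    static int partition(String key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    // Reduce step: sum all values that were emitted for one key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Drive the three steps: map every line, group pairs per partition,
    // then reduce each key and merge the per-partition results.
    static Map<String, Integer> run(List<String> lines, int numReducers) {
        // One sorted key -> values map per simulated reduce task.
        List<TreeMap<String, List<Integer>>> partitions = new ArrayList<>();
        for (int i = 0; i < numReducers; i++) {
            partitions.add(new TreeMap<>());
        }
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                int p = partition(pair.getKey(), numReducers);
                partitions.get(p)
                        .computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                        .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (TreeMap<String, List<Integer>> part : partitions) {
            for (Map.Entry<String, List<Integer>> e : part.entrySet()) {
                result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("the quick fox", "the lazy dog", "the fox");
        // prints {dog=1, fox=2, lazy=1, quick=1, the=3}
        System.out.println(run(lines, 2));
    }
}
```

Because every occurrence of a key hashes to the same partition, each key is reduced exactly once regardless of how many reduce tasks are simulated. A custom partitioner would replace the partition method, and a combiner would apply the reduce logic to each mapper's output before the pairs are grouped.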

We also learned how Hadoop reads data from and writes data to the filesystem. It uses the concepts of InputFormat and OutputFormat to handle the file as a whole, and RecordReader and RecordWriter to translate between the file format and key/value pairs.

With this knowledge, we will now move on to a case study in the next chapter, which demonstrates the ongoing development and enhancement of a MapReduce application that processes a large data set.