Map

The map function takes a series of key/value pairs, processes each, and generates zero or more output key/value pairs. The input and output types of the map can be (and often are) different from each other.

If the application is doing a word count, the map function will break each input line into words and output a key/value pair for each distinct word. Each output pair will contain the word as the key and the number of instances of that word in the line as the value.
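The word-count map step described above can be sketched in a few lines of plain Python. This is only an illustration, not the Hadoop API: the function name `word_count_map`, the use of a byte offset as the input key, and the lowercase/whitespace tokenization are all assumptions made for the example.

```python
from collections import Counter
from typing import Iterator, Tuple


def word_count_map(offset: int, line: str) -> Iterator[Tuple[str, int]]:
    """Hypothetical map function: one (word, count) pair per distinct word.

    `offset` stands in for the input key (assumed here to be the line's
    position in the file); it is not needed for counting and is ignored.
    """
    # Assumed tokenization: lowercase, split on whitespace.
    counts = Counter(line.lower().split())
    for word, count in counts.items():
        yield word, count


# Collect the intermediate output for a single input line.
pairs = sorted(word_count_map(0, "the cat and the hat"))
```

An empty line produces no output pairs at all, which is the "zero or more" case mentioned earlier.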

In the mapper, code is executed on each key/value pair from the record reader to produce zero or more new key/value pairs, called the intermediate output of the mapper. The choice of key and value for each record is directly related to what the MapReduce job is accomplishing. The key is what the data will be grouped on, and the value is the part of the data used in the reducer to generate the necessary output. One of the key items discussed in the patterns is how different types of use cases determine the particular key/value logic. In fact, the semantics of this logic are a key differentiator between MapReduce design patterns.
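To make the role of the key concrete, the following sketch runs two different mappers over the same records and groups the intermediate output by key, as the framework would before the reduce phase. Everything here is hypothetical: the record fields (`user`, `city`, `amount`), the mapper names, and the in-memory `run_map_and_group` helper, which only imitates grouping and is not part of any MapReduce library.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Record = Dict[str, object]
Pair = Tuple[str, int]


def map_by_city(record: Record) -> Iterator[Pair]:
    # Key = city, so the reducer would see all amounts for one city together.
    yield str(record["city"]), int(record["amount"])


def map_by_user(record: Record) -> Iterator[Pair]:
    # Same data, different key: amounts are now grouped per user.
    yield str(record["user"]), int(record["amount"])


def run_map_and_group(
    records: Iterable[Record], map_fn: Callable[[Record], Iterator[Pair]]
) -> Dict[str, List[int]]:
    """Apply map_fn to every record and group the emitted values by key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            grouped[key].append(value)
    return dict(grouped)


# Made-up sample records, for illustration only.
records = [
    {"user": "ann", "city": "Oslo", "amount": 10},
    {"user": "bob", "city": "Oslo", "amount": 5},
    {"user": "ann", "city": "Rome", "amount": 7},
]

by_city = run_map_and_group(records, map_by_city)
by_user = run_map_and_group(records, map_by_user)
```

The input is identical in both runs; only the key emitted by the mapper changes, and with it the groups a reducer would receive.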