
Combiner

If every output of every mapper is sent directly over the network to the reducers, the job consumes a significant amount of resources and time. The combiner, an optional localized reducer, can group data in the map phase. It takes the intermediate keys from a mapper and applies a user-provided method to aggregate their values within the small scope of that one mapper. For example, because the count for an aggregation is the sum of the counts of its parts, each mapper can produce an intermediate count, and the reducer can then sum those intermediate counts for the final result. In many situations, this significantly reduces the amount of data that has to move over the network. For instance, in a dataset of cities and temperatures, sending (Boston, 66) once requires fewer bytes than sending (Boston, 20), (Boston, 25), and (Boston, 21) as three separate records. Combiners often provide significant performance gains with no downsides.
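
The effect on shuffled data can be illustrated with a small in-memory simulation. This is only a sketch in plain Python, not code for any MapReduce framework: the function names (`map_records`, `combine`, `reduce_all`) are hypothetical, and every record other than the three Boston readings from the example above is made up for illustration.

```python
from collections import defaultdict

def map_records(records):
    """Mapper: emit one (city, temperature) pair per input record."""
    return [(city, temp) for city, temp in records]

def combine(pairs):
    """Combiner: sum the values per key within ONE mapper's output."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())

def reduce_all(pairs):
    """Reducer: sum the values per key across all mappers."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

# Two input splits, one per mapper (second split is made-up data).
mapper_inputs = [
    [("Boston", 20), ("Boston", 25), ("Boston", 21)],
    [("Boston", 30), ("Austin", 35)],
]

# Pairs that would cross the network in each configuration.
without_combiner = [p for split in mapper_inputs for p in map_records(split)]
with_combiner = [p for split in mapper_inputs for p in combine(map_records(split))]

print(len(without_combiner), "pairs shuffled without a combiner")
print(len(with_combiner), "pairs shuffled with a combiner")
print(reduce_all(with_combiner))
```

In this sketch the first mapper sends a single (Boston, 66) pair instead of three, and the reducer output is identical in both configurations.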

We will point out which patterns benefit from using a combiner and which ones cannot use one. A combiner is not guaranteed to execute, so the overall algorithm cannot depend on it: the job must produce the same result whether or not the combiner runs.
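
One way to read this constraint is that a job should give the same answer with the combiner switched on or off. The sketch below, again plain Python with hypothetical names (`mapper`, `merge`, `run_job`) and partly made-up data, computes a mean temperature by carrying (sum, count) pairs, so the merge step is safe to apply zero or more times. It also shows why averaging per-mapper averages would not be a valid combiner.

```python
def mapper(records):
    """Emit (city, (temperature, 1)) so partial results can be merged."""
    return [(city, (temp, 1)) for city, temp in records]

def merge(pairs):
    """Add up (sum, count) values per key. Because this is associative
    and commutative, it can serve as the combiner and as the reducer's
    aggregation step."""
    acc = {}
    for key, (s, c) in pairs:
        prev_s, prev_c = acc.get(key, (0, 0))
        acc[key] = (prev_s + s, prev_c + c)
    return acc

def run_job(splits, use_combiner):
    """Simulate the job; the combiner step is optional by design."""
    shuffled = []
    for split in splits:
        pairs = mapper(split)
        if use_combiner:
            pairs = list(merge(pairs).items())
        shuffled.extend(pairs)
    return {city: s / c for city, (s, c) in merge(shuffled).items()}

# Two splits; the second one is made-up data for illustration.
splits = [
    [("Boston", 20), ("Boston", 25), ("Boston", 21)],
    [("Boston", 30)],
]

mean_with = run_job(splits, use_combiner=True)
mean_without = run_job(splits, use_combiner=False)

# Incorrect alternative: average each mapper's output, then average those.
naive_mean = (sum([20, 25, 21]) / 3 + 30 / 1) / 2

print(mean_with, mean_without, naive_mean)
```

Both runs of `run_job` agree, while the naive average of averages gives a different number because it ignores how many readings each mapper saw.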
