- Big Data Analytics with Hadoop 3
- Sridhar Alla
- 2021-06-25 21:26:15
Combiner
If every mapper sends all of its output directly to the reducers, the job consumes a significant amount of network resources and time. The combiner, an optional localized reducer, can group data in the map phase. It takes the intermediate keys from the mapper and applies a user-provided method to aggregate values within the small scope of that one mapper. For example, because the count for an aggregation is the sum of the counts of its parts, you can produce an intermediate count in each mapper and then sum those intermediate counts for the final result. In many situations, this significantly reduces the amount of data that has to move over the network. For instance, in a dataset of cities and temperatures where the values are summed per city, sending the single pair (Boston, 66) requires fewer bytes than sending the three pairs (Boston, 20), (Boston, 25), and (Boston, 21) separately over the network. Combiners often provide significant performance gains with no downsides.
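The Boston example can be sketched in a few lines of plain Python. This is only a simulation of the idea, not Hadoop code: the `combine` and `reduce_sum` functions and the sample `mapper_outputs` are hypothetical names and data chosen for illustration, and the second mapper's pairs are invented to show more than one mapper.

```python
from collections import defaultdict

def combine(pairs):
    """Sum values per key within ONE mapper's output (the local, map-side step)."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return list(totals.items())

def reduce_sum(pairs):
    """Sum values per key across everything that reached the reducer."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

# Hypothetical output of two mappers (illustrative data only).
mapper_outputs = [
    [("Boston", 20), ("Boston", 25), ("Boston", 21)],
    [("Boston", 22), ("New York", 18)],
]

# Pairs that would cross the network without and with a combiner.
without_combiner = [pair for out in mapper_outputs for pair in out]
with_combiner = [pair for out in mapper_outputs for pair in combine(out)]

result_plain = reduce_sum(without_combiner)
result_combined = reduce_sum(with_combiner)

print(len(without_combiner), len(with_combiner))
print(result_plain == result_combined)
```

In this sketch the first mapper's three Boston pairs collapse into one, so fewer pairs reach the reducer while the final totals stay the same.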
We will point out which patterns benefit from using a combiner and which ones cannot use one. A combiner is not guaranteed to execute, so the overall algorithm cannot depend on it: the job must produce the same result whether the combiner runs or not.
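One way to see why the algorithm cannot rely on the combiner is to simulate a job in which the combiner runs zero, one, or two times and compare the results. The sketch below is plain Python, not Hadoop code; `run_job`, `sum_combiner`, `mean_combiner`, and the sample values are hypothetical and chosen only for illustration. Summation is assumed here as an example of a pattern that tolerates a combiner, and a naive mean as an example of one that does not.

```python
def sum_combiner(values):
    """Collapse a mapper's values for one key into a single partial sum."""
    return [sum(values)]

def mean_combiner(values):
    """Naive combiner that emits a local mean (NOT safe, shown as a counterexample)."""
    return [sum(values) / len(values)]

def mean(values):
    """Reducer for the averaging job."""
    return sum(values) / len(values)

def run_job(mapper_outputs, combiner, reducer, combiner_runs):
    """Simulate one key: apply the combiner `combiner_runs` times per mapper, then reduce."""
    shuffled = []
    for values in mapper_outputs:
        for _ in range(combiner_runs):
            values = combiner(values)
        shuffled.extend(values)
    return reducer(shuffled)

# Hypothetical per-mapper values for a single key (illustrative data only).
boston = [[20, 25, 21], [30]]

# Summing gives the same answer however often the combiner runs.
sums = [run_job(boston, sum_combiner, sum, runs) for runs in (0, 1, 2)]

# Averaging local averages changes the answer once the combiner runs.
means = [run_job(boston, mean_combiner, mean, runs) for runs in (0, 1, 2)]

print(sums)
print(means)
```

In this simulation the sum is unaffected by the number of combiner runs, while the naive mean differs between the run without a combiner and the runs with one, which is the kind of dependence a correct job has to avoid.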