- Big Data Analytics with Hadoop 3
- Sridhar Alla
SingleMapperCombinerReducer job
SingleMapperCombinerReducer jobs are used in aggregation use cases. A combiner, also known as a semi-reducer, is an optional class that accepts the output of the map class and passes its own key/value pairs on to the reducer class. The purpose of the combiner is to reduce the workload of the reducer.
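A minimal driver sketch of such a job is shown below; the TemperatureSumDriver, TemperatureMapper, and TemperatureReducer class names are illustrative placeholders rather than classes from the book. The only change relative to a plain SingleMapperReducer job is the job.setCombinerClass() call:

```java
// Hypothetical driver for a SingleMapperCombinerReducer job.
// TemperatureMapper and TemperatureReducer are placeholder names.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class TemperatureSumDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "temperature sum");
        job.setJarByClass(TemperatureSumDriver.class);

        job.setMapperClass(TemperatureMapper.class);
        // The combiner runs on the map side and pre-aggregates map output;
        // the reducer class doubles as the combiner here because summing is
        // commutative and associative.
        job.setCombinerClass(TemperatureReducer.class);
        job.setReducerClass(TemperatureReducer.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```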

In a MapReduce program, 25% of the work is done in the map stage, also known as the data preparation stage, which runs in parallel. The remaining 75% of the work is done in the reduce stage, known as the calculation stage, which is not parallel and is therefore slower than the map phase. To reduce the overall time, some of the work of the reduce phase can be done in the combiner phase.
For example, if we have a combiner, a mapper that sees (Boston, 22), (Boston, 24), and (Boston, 20) as input records will send a single aggregated record, (Boston, 66), instead of sending three individual key/value pairs across the network.
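The following sketch fills in the hypothetical TemperatureMapper and TemperatureReducer used by the driver above, assuming input lines of the form city,temperature. Because the reducer is also registered as the combiner, each mapper's output is pre-aggregated locally before it crosses the network:

```java
// Hypothetical mapper/reducer pair; input lines are assumed to look like
// "Boston,22". Names and input format are illustrative, not from the book.
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class TemperatureMapper
        extends Mapper<LongWritable, Text, Text, IntWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Split "Boston,22" into the city key and the temperature value.
        String[] parts = value.toString().split(",");
        context.write(new Text(parts[0].trim()),
                      new IntWritable(Integer.parseInt(parts[1].trim())));
    }
}

// Used as both the combiner and the reducer: as a combiner it sums the values
// emitted by a single mapper (22 + 24 + 20 = 66); as a reducer it sums the
// pre-aggregated values arriving from all mappers.
class TemperatureReducer
        extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();
        }
        context.write(key, new IntWritable(sum));
    }
}
```

Reusing the reducer as the combiner only works because summing is commutative and associative; for an operation such as an average, a separate combiner class would be needed.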