Aggregation in MapReduce:

Let's take the example of a WordCount MapReduce program that has two mappers.
In the mapper, we write the logic so that it emits each word as the key with a value of 1, i.e.

Mapper 1 
(word1, 1)
(word2, 1)
(word1, 1)


Mapper 2
(word2, 1)
(word3, 1)
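
The map step above can be sketched in a few lines. This is a plain Python simulation, not Hadoop code: `map_words` is an illustrative name, and the two input strings are assumed inputs chosen only so that the output matches the pairs listed above.

```python
def map_words(line):
    """Emit a (word, 1) pair for every word in the input line."""
    return [(word, 1) for word in line.split()]

# Assumed input for each mapper (hypothetical, picked to match the example).
mapper1_output = map_words("word1 word2 word1")
mapper2_output = map_words("word2 word3")

print(mapper1_output)  # [('word1', 1), ('word2', 1), ('word1', 1)]
print(mapper2_output)  # [('word2', 1), ('word3', 1)]
```

Note that the mapper does no counting of its own: a word that appears twice is emitted twice.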

When the mappers return their data, the intermediate results are written to disk and then sent across the network to the reducers for final processing. Writing this data to disk and then transferring it across the network is an expensive part of a MapReduce job. Hence it is worth reducing the amount of data sent across the network to the reducers whenever possible.

Aggregation is a technique in MapReduce that reduces the amount of intermediate data and so improves the efficiency of a MapReduce job.

We use a combiner to perform aggregation in MapReduce. The job of a combiner is to aggregate the output of each mapper locally, with the net result that less data is shuffled across the network.

Note: Aggregation cannot take the place of reducers. A combiner gathers the results with the same key for each mapper separately, so we still need a way to gather results with the same key from different mappers.

Because combiner functions are an optimization, the Hadoop framework offers no guarantee on how many times a combiner will be called, if at all. The job must therefore give the same result with or without the combiner, which is another reason the reducers are still required.
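
The point about call counts can be checked with a small simulation. This is plain Python rather than Hadoop code; `sum_by_key` and `run_job` are illustrative names, and the combiner is assumed to be the same summing function as the reducer, which holds for word count but not for every job.

```python
from collections import defaultdict

def sum_by_key(pairs):
    """Sum the values for each key; used here as both combiner and reducer."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return sorted(totals.items())

# Mapper outputs from the example above.
mapper1_output = [("word1", 1), ("word2", 1), ("word1", 1)]
mapper2_output = [("word2", 1), ("word3", 1)]

def run_job(combiner_passes):
    """Run the job with the combiner applied a given number of times per mapper."""
    shuffled = []
    for output in (mapper1_output, mapper2_output):
        for _ in range(combiner_passes):
            output = sum_by_key(output)
        shuffled.extend(output)
    return sum_by_key(shuffled)  # reducer step

# Combiner skipped, applied once, applied twice.
results = [run_job(n) for n in (0, 1, 2)]
print(results[0] == results[1] == results[2])  # True
```

Summation is associative and commutative, so partial sums can be merged in any order and any number of times. An operation without those properties, such as a plain average, would not be safe to use as a combiner in this way.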

For the above data returned by the mappers, each combiner receives the values grouped by key (for example {word1, [1,1]}) and sums them, so the combiner output will be as follows:

Combiner for Mapper 1
(word1, 2)
(word2, 1)

Combiner for Mapper 2
(word2, 1)
(word3, 1)
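
As a rough sketch of what happens on each mapper, the combiner step can be modelled as grouping the values for each key and then summing them. The code below is a plain Python simulation with illustrative names (`group_by_key`, `combine`), not the Hadoop API.

```python
from collections import defaultdict

def group_by_key(pairs):
    """Collect all values emitted for the same key into one list."""
    grouped = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return dict(grouped)

def combine(pairs):
    """Local aggregation: one (word, partial_count) record per key."""
    return [(word, sum(counts)) for word, counts in group_by_key(pairs).items()]

mapper1_output = [("word1", 1), ("word2", 1), ("word1", 1)]
mapper2_output = [("word2", 1), ("word3", 1)]

grouped1 = group_by_key(mapper1_output)  # {'word1': [1, 1], 'word2': [1]}
combined1 = combine(mapper1_output)      # [('word1', 2), ('word2', 1)]
combined2 = combine(mapper2_output)      # [('word2', 1), ('word3', 1)]

records_before = len(mapper1_output) + len(mapper2_output)
records_after = len(combined1) + len(combined2)
print(records_before, records_after)  # 5 4
```

The saving is small in this toy example, but it grows with the number of repeated words per mapper. Note that word2 still appears in both combiner outputs, because each combiner only sees the output of its own mapper.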

The above results are then passed to the reducers for final processing.

Consolidation in MapReduce:
The reducer consolidates the results from the different mappers and gives the final output.

Hence the consolidated output for the word count program would be:

word1, 2
word2, 2
word3, 1
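
The consolidation step can be sketched the same way. This is again a plain Python simulation, not Hadoop code: `shuffle` and `reduce_counts` are illustrative names, and the two input lists are assumed to be the already summed partial counts arriving from each mapper.

```python
from collections import defaultdict

# Assumed partial counts arriving from the two mappers.
combiner1_output = [("word1", 2), ("word2", 1)]
combiner2_output = [("word2", 1), ("word3", 1)]

def shuffle(*outputs):
    """Group values by key across all mappers, as the shuffle phase does."""
    grouped = defaultdict(list)
    for output in outputs:
        for word, count in output:
            grouped[word].append(count)
    return grouped

def reduce_counts(word, counts):
    """Reducer: add up the partial counts for one word."""
    return word, sum(counts)

grouped = shuffle(combiner1_output, combiner2_output)
final_output = [reduce_counts(word, counts) for word, counts in sorted(grouped.items())]

for word, total in final_output:
    print(f"{word}, {total}")
```

Only here do the two partial counts for word2 meet, which is why the reducer cannot be replaced by the combiners.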

This is how aggregation and consolidation work in MapReduce.

Hope this helps you.

Please feel free to reply if you need any further help.
