Map Reduce model with example
The MapReduce programming model is clearly summarized in the following quote:
“The computation takes a set of input key/value pairs, and produces a set of output key/value pairs. The user of the MapReduce library expresses the computation as two functions: map and reduce.
Map, written by the user, takes an input pair and produces a set of inter- mediate key/value pairs. The MapReduce library groups together all in- termediate values associated with the same intermediate key I and passes them to the reduce function.
The reduce function, also written by the user, accepts an intermediate key I and a set of values for that key. It merges together these values to form a possibly smaller set of values. Typically just zero or one output value is produced per reduce invocation. The intermediate values are supplied to the user’s reduce function via an iterator. This allows us to handle lists of values that are too large to fit in memory.”
We also quote an example including pseudo-code:
”Consider the problem of counting the number of occurrences of each word in a large collection of documents. The user would write code similar to the following pseudo-code:
“The computation takes a set of input key/value pairs, and produces a set of output key/value pairs. The user of the MapReduce library expresses the computation as two functions: map and reduce.
Map, written by the user, takes an input pair and produces a set of inter- mediate key/value pairs. The MapReduce library groups together all in- termediate values associated with the same intermediate key I and passes them to the reduce function.
The reduce function, also written by the user, accepts an intermediate key I and a set of values for that key. It merges together these values to form a possibly smaller set of values. Typically just zero or one output value is produced per reduce invocation. The intermediate values are supplied to the user’s reduce function via an iterator. This allows us to handle lists of values that are too large to fit in memory.”
We also quote an example including pseudo-code:
”Consider the problem of counting the number of occurrences of each word in a large collection of documents. The user would write code similar to the following pseudo-code:
map(String key, String value):
// key: document name
// value: document contents
for each word w in value:
EmitIntermediate(w, "1");
reduce(String key, Iterator values):
// key: a word
// values: a list of counts
int result = 0;
for each v in values:
result += ParseInt(v);
Emit(AsString(result));
Thanks 😊
ReplyDeleteThese are really helpful!