MapReduce (Pipelines)

Back to Apache Hadoop

A programming model for distributed processing. Map phase transforms input into key-value pairs. Reduce phase aggregates values by key. Disk-based, slower than in-memory alternatives.

data-pipelines batch hadoop mapreduce