MapReduce
← Back to Data Parallelism
A programming model for processing large datasets in parallel across a distributed cluster. Input data is partitioned across workers and transformed into key-value pairs in parallel (map phase); the pairs are then shuffled so that all values for a given key reach the same worker, where they are aggregated (reduce phase). The framework handles data distribution, fault tolerance, and scheduling.
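The three phases can be sketched in plain Python. This is a minimal single-process illustration of the model (word counting, the canonical example), not a distributed implementation; all function names are illustrative, and the shuffle is simulated with an in-memory dictionary.

```python
from collections import defaultdict

def map_phase(documents):
    """Map: emit a (word, 1) pair for every word in every document."""
    for doc in documents:
        for word in doc.split():
            yield (word, 1)

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: combine each key's values into a single result."""
    return {key: sum(values) for key, values in groups.items()}

docs = ["the quick brown fox", "the lazy dog", "the fox"]
counts = reduce_phase(shuffle(map_phase(docs)))
# counts["the"] == 3, counts["fox"] == 2
```

In a real framework such as Hadoop, the map and reduce functions run on many machines, and the shuffle moves data over the network; the user supplies only the two functions.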