MapReduce
← Back to Data Parallelism
A programming model for processing large datasets in parallel across a distributed cluster. Input data is partitioned across workers and transformed into key-value pairs in parallel (map phase); the pairs are then shuffled so that all values for a given key reach the same worker, where they are aggregated (reduce phase). The framework handles data distribution, fault tolerance, and scheduling.
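The three phases can be sketched in plain Python. This is a minimal single-process illustration of the model (word counting, the canonical example), not a distributed implementation; all function names are illustrative, and the shuffle is simulated with an in-memory dictionary.

```python
from collections import defaultdict

def map_phase(documents):
    """Map: emit a (word, 1) pair for every word in every document."""
    for doc in documents:
        for word in doc.split():
            yield (word, 1)

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: combine each key's values into a single result."""
    return {key: sum(values) for key, values in groups.items()}

docs = ["the quick brown fox", "the lazy dog", "the fox"]
counts = reduce_phase(shuffle(map_phase(docs)))
# counts["the"] == 3, counts["fox"] == 2
```

In a real framework such as Hadoop, the map and reduce functions run on many machines, and the shuffle moves data over the network; the user supplies only the two functions.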