MapReduce Paper (2004)
“MapReduce: Simplified Data Processing on Large Clusters” (OSDI 2004) — Jeffrey Dean and Sanjay Ghemawat
Introduced the MapReduce programming model for processing large datasets in parallel on clusters of commodity machines. A user-supplied map function processes input key/value pairs and emits intermediate key/value pairs; a reduce function merges all intermediate values associated with the same intermediate key. The runtime abstracts away parallelization, data distribution, load balancing, and fault tolerance. The model became the foundation for Hadoop MapReduce. A minimal sketch of the idea is shown below.
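As a rough illustration, here is a minimal sketch of the model using the paper's canonical word-count example. The `map_fn`/`reduce_fn` names and the in-memory `run_mapreduce` driver are illustrative only; the real system runs these user functions in parallel across a cluster and handles partitioning, shuffling, and fault tolerance itself.

```python
from collections import defaultdict
from typing import Iterator, List, Tuple

def map_fn(doc_name: str, contents: str) -> Iterator[Tuple[str, int]]:
    # map(key, value) -> intermediate (key, value) pairs: emit (word, 1) per word.
    for word in contents.split():
        yield (word, 1)

def reduce_fn(word: str, counts: List[int]) -> Tuple[str, int]:
    # reduce(key, values) -> merged output: sum the counts for one word.
    return (word, sum(counts))

def run_mapreduce(inputs: List[Tuple[str, str]]) -> List[Tuple[str, int]]:
    # Sequential stand-in for the distributed runtime: apply map, then
    # group intermediate values by key (the "shuffle"), then apply reduce.
    groups = defaultdict(list)
    for key, value in inputs:
        for k, v in map_fn(key, value):
            groups[k].append(v)
    return [reduce_fn(k, vs) for k, vs in sorted(groups.items())]

if __name__ == "__main__":
    docs = [("doc1", "the quick brown fox"), ("doc2", "the lazy dog")]
    print(run_mapreduce(docs))
    # [('brown', 1), ('dog', 1), ('fox', 1), ('lazy', 1), ('quick', 1), ('the', 2)]
```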
Related
#google-internal #papers #mapreduce #2004