MapReduce to Hadoop Mapping

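|         | Google Internal  | Open Source      | GCP        |
|---------|------------------|------------------|------------|
| System  | MapReduce        | Hadoop MapReduce | — (legacy) |
| Concept | Batch processing | Batch processing |            |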

Google’s MapReduce, described in a 2004 paper, defined the paradigm for distributed batch data processing. Hadoop MapReduce was directly inspired by that paper and became the de facto open-source standard for batch processing. Inside Google, MapReduce was later superseded by FlumeJava/Flume. GCP has no direct equivalent, because the paradigm evolved into Dataflow and Apache Beam.
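
To make the paradigm concrete, here is a minimal single-process sketch of the map, shuffle, and reduce phases, using word count as the example job. It is not Hadoop or Google code: the names `map_words`, `reduce_counts`, and `run_mapreduce` are hypothetical, and a real system would distribute each phase across many machines and handle failures.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_words(doc_id: str, text: str) -> Iterator[Tuple[str, int]]:
    """Map step (hypothetical): emit (word, 1) for every word in a document."""
    for word in text.lower().split():
        yield word, 1


def reduce_counts(word: str, counts: Iterable[int]) -> Tuple[str, int]:
    """Reduce step (hypothetical): sum all counts emitted for one word."""
    return word, sum(counts)


def run_mapreduce(
    inputs: Dict[str, str],
    mapper: Callable[[str, str], Iterator[Tuple[str, int]]],
    reducer: Callable[[str, Iterable[int]], Tuple[str, int]],
) -> Dict[str, int]:
    """Run map, shuffle, and reduce in one process (illustration only)."""
    # Map + shuffle: apply the mapper to each input record and group the
    # intermediate values by key, as the framework would do between phases.
    intermediate: Dict[str, List[int]] = defaultdict(list)
    for key, value in inputs.items():
        for out_key, out_value in mapper(key, value):
            intermediate[out_key].append(out_value)

    # Reduce: call the reducer once per distinct intermediate key.
    return dict(reducer(k, vs) for k, vs in sorted(intermediate.items()))


docs = {"d1": "the quick brown fox", "d2": "the lazy dog the end"}
word_counts = run_mapreduce(docs, map_words, reduce_counts)
print(word_counts["the"])  # number of times "the" appears across both documents
```

The user supplies only the mapper and the reducer; grouping by key between the two phases is the framework's job, which is the division of labor that both Google's MapReduce and Hadoop MapReduce expose.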


google-internal mapping mapreduce hadoop