RDDs

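Resilient Distributed Datasets (RDDs) are Spark’s fundamental abstraction: an immutable, partitioned collection of objects that can be processed in parallel across a cluster. Fault tolerance comes from lineage tracking: each RDD records the transformations that produced it, so a lost partition can be recomputed from its source rather than restored from a replica.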
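
To make the lineage idea concrete, here is a toy model in plain Python. It is not Spark's API and needs no Spark installation; `ToyRDD`, `from_list`, and the round-robin partitioning are hypothetical names and choices made only for illustration. The sketch shows three properties from the definition above: the dataset is immutable, it is split into partitions that can be computed independently, and any single partition can be rebuilt by replaying the recorded transformations.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)  # frozen: instances cannot be modified after creation
class ToyRDD:
    """Hypothetical stand-in for an RDD, not Spark's real class."""
    num_partitions: int
    # Computes one partition from scratch, given its index.
    compute: Callable[[int], List]
    # Parent in the lineage chain (None for a source dataset).
    parent: Optional["ToyRDD"] = None
    op: str = "source"

    @staticmethod
    def from_list(data, num_partitions):
        data = list(data)

        def compute(i):
            # Round-robin split (an assumption): partition i takes every n-th element.
            return data[i::num_partitions]

        return ToyRDD(num_partitions, compute)

    def map(self, f):
        # A transformation returns a new ToyRDD; nothing is computed yet.
        return ToyRDD(self.num_partitions,
                      lambda i: [f(x) for x in self.compute(i)],
                      parent=self, op="map")

    def filter(self, pred):
        return ToyRDD(self.num_partitions,
                      lambda i: [x for x in self.compute(i) if pred(x)],
                      parent=self, op="filter")

    def lineage(self):
        # Walk back to the source, listing the operations that built this dataset.
        node, ops = self, []
        while node is not None:
            ops.append(node.op)
            node = node.parent
        return list(reversed(ops))

    def collect(self):
        # An action: evaluate every partition and concatenate the results.
        return [x for i in range(self.num_partitions) for x in self.compute(i)]


numbers = ToyRDD.from_list(range(10), num_partitions=3)
evens_squared = numbers.map(lambda x: x * x).filter(lambda x: x % 2 == 0)

print(evens_squared.lineage())          # ['source', 'map', 'filter']
# Pretend partition 1 was lost: rebuild it from lineage, leaving the others alone.
print(evens_squared.compute(1))         # [16]
print(sorted(evens_squared.collect()))  # [0, 4, 16, 36, 64]
```

Note that `map` and `filter` only extend the lineage chain; work happens when `collect` or `compute` is called. Real Spark behaves analogously (transformations are lazy, actions trigger execution), though its scheduler, partitioners, and caching are far more involved than this sketch.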

data-pipelines batch spark rdd