DataFrames
A distributed collection of data organized into named columns, similar to a table in a relational database. It is a higher-level abstraction than an RDD, and queries against it are optimized by Spark's Catalyst query optimizer.