Trino


Open source distributed SQL query engine for interactive analytics across heterogeneous data sources. Formerly named PrestoSQL, it is a fork of the Presto project that originated at Facebook. Trino queries data in place without requiring ETL into a separate system: a single SQL query can join across a data lake, a relational database, and a key-value store.
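A minimal sketch of what such a cross-source query can look like. Trino addresses tables as catalog.schema.table, so one statement can name tables from several sources. The catalog, schema, table, and column names below are hypothetical, and the snippet only builds the SQL text; it does not connect to a cluster.

```python
def qualified(catalog: str, schema: str, table: str) -> str:
    """Return a fully qualified table name in the form catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"


# Hypothetical catalogs, schemas, and tables, for illustration only.
ORDERS = qualified("hive", "sales", "orders")               # files in a data lake
CUSTOMERS = qualified("postgresql", "public", "customers")  # relational database
SESSIONS = qualified("redis", "default", "sessions")        # key-value store

# Revenue per region, restricted to customers that have a session entry.
FEDERATED_QUERY = f"""
SELECT c.region, sum(o.total) AS revenue
FROM {ORDERS} AS o
JOIN {CUSTOMERS} AS c ON o.customer_id = c.id
WHERE c.id IN (SELECT s.customer_id FROM {SESSIONS} AS s)
GROUP BY c.region
""".strip()

print(FEDERATED_QUERY)
```

Which catalogs exist, and which source each one points at, depends on how the cluster is configured; the query text itself stays ordinary SQL.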

Key Properties

How It Works

A coordinator node parses SQL, plans the query, and distributes work across worker nodes. Connectors abstract data sources (Hive/S3, PostgreSQL, Kafka, Elasticsearch, etc.) behind a common interface. By default, Trino executes in memory with pipelined stages and no intermediate disk writes, which suits low-latency interactive queries better than long-running ETL. Spill to disk and fault-tolerant execution are optional configurations that trade some latency for resilience on larger queries.
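The idea of splitting work and streaming rows through stages can be sketched as a toy model. This is not Trino's implementation: the function names, the row shape, and the sequential loop standing in for parallel workers are all assumptions made for illustration. Python generators play the role of pipelined stages, so no stage materializes its full output before the next one starts.

```python
from collections import Counter
from typing import Iterable, Iterator

Row = dict


def scan(split: list[Row]) -> Iterator[Row]:
    """Source stage: yield rows one at a time from a split of the data."""
    for row in split:
        yield row


def filter_stage(rows: Iterable[Row], min_total: float) -> Iterator[Row]:
    """Pass through only rows whose total meets the threshold."""
    for row in rows:
        if row["total"] >= min_total:
            yield row


def partial_count(rows: Iterable[Row]) -> Counter:
    """Worker-side partial aggregation: count rows per region."""
    counts: Counter = Counter()
    for row in rows:
        counts[row["region"]] += 1
    return counts


def run_query(splits: list[list[Row]], min_total: float) -> dict[str, int]:
    """Hand each split to a (simulated) worker, then merge the partial counts."""
    final: Counter = Counter()
    for split in splits:  # each iteration stands in for one worker
        final += partial_count(filter_stage(scan(split), min_total))
    return dict(final)


splits = [
    [{"region": "EU", "total": 40.0}, {"region": "US", "total": 5.0}],
    [{"region": "EU", "total": 12.0}, {"region": "US", "total": 90.0}],
]
result = run_query(splits, min_total=10.0)
print(result)
```

The sketch collapses scheduling and merging into one function for brevity and makes no claim about where Trino runs each step. The point is only that rows flow from scan to filter to aggregation without an intermediate write.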

Connectors

Hive / S3 — query Parquet, ORC, Avro files in data lakes
Iceberg / Delta Lake — table formats with ACID semantics
RDBMS — PostgreSQL, MySQL, SQL Server
NoSQL — Cassandra, Elasticsearch, MongoDB, Redis
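To make the "common interface" idea concrete, here is a small sketch in which two sources with different storage layouts expose the same scan method, so a single join routine can work across both. The Connector protocol, class names, and sample data are hypothetical and much simpler than Trino's actual connector interface.

```python
from typing import Iterable, Iterator, Protocol


class Connector(Protocol):
    """Hypothetical common interface: every source yields rows as dicts."""

    def scan(self, table: str) -> Iterator[dict]: ...


class ColumnarConnector:
    """Stands in for a data-lake source that stores data column by column."""

    def __init__(self, tables: dict[str, dict[str, list]]):
        self.tables = tables

    def scan(self, table: str) -> Iterator[dict]:
        columns = self.tables[table]
        names = list(columns)
        for values in zip(*columns.values()):
            yield dict(zip(names, values))


class RowConnector:
    """Stands in for a relational source that stores a header plus tuples."""

    def __init__(self, tables: dict[str, tuple[list[str], list[tuple]]]):
        self.tables = tables

    def scan(self, table: str) -> Iterator[dict]:
        header, rows = self.tables[table]
        for row in rows:
            yield dict(zip(header, row))


def hash_join(left: Iterable[dict], right: Iterable[dict],
              left_key: str, right_key: str) -> list[dict]:
    """Build a hash index on the right side, then probe it with left rows."""
    index: dict = {}
    for row in right:
        index.setdefault(row[right_key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[left_key], [])]


catalogs: dict[str, Connector] = {
    "lake": ColumnarConnector(
        {"orders": {"customer_id": [1, 2, 1], "total": [40.0, 5.0, 12.0]}}
    ),
    "crm": RowConnector({"customers": (["id", "region"], [(1, "EU"), (2, "US")])}),
}

joined = hash_join(
    catalogs["lake"].scan("orders"),
    catalogs["crm"].scan("customers"),
    left_key="customer_id",
    right_key="id",
)
for row in joined:
    print(row)
```

The join routine never inspects how either source stores its data; it sees only rows. That separation is what lets an engine add a new source by adding a connector rather than changing the query layer.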

Related

Data Lakehouse (Trino commonly used as the query layer)
Data Warehouse (alternative approach — Trino queries data in place instead)
Apache Spark (complementary — Spark for heavy ETL, Trino for interactive queries)
BigQuery (managed alternative on GCP)

data-pipelines batch trino

Software Engineering KB
