Phlo
PYTHON FRAMEWORK FOR LAKEHOUSES

Build data workflows that flow to production.

Phlo is a modular Python framework for lakehouse applications. Add packages for orchestration, storage, ingestion, quality, observability, and UI extensions, then run the platform from one project.

phlo quickstart
# choose the lakehouse services you need
$ uv pip install "phlo[defaults]"
$ phlo init my-lakehouse
$ phlo services start
  dagster  minio  nessie  trino

# materialize assets with checks and lineage
$ phlo materialize --select "dlt_glucose_entries+"
  checks passed · lineage updated
$ phlo logs --asset glucose_entries

Services: Dagster + Nessie
Quality checks: run with materializations
Lineage: updated per run
Observatory: pipeline visibility

PROJECT WORKFLOW

Run a lakehouse as one project.

Phlo wraps the moving parts of a lakehouse into a Python project: services, workflows, materializations, checks, lineage, and observability.

  1. Bootstrap a project

    Create the workspace, configuration, env files, and workflow folders Phlo expects.

  2. Start the lakehouse stack

    Bring up the services your project uses: orchestration, storage, catalog, query, and UI.

  3. Materialize and inspect

    Run assets, evaluate checks, then inspect lineage, logs, metrics, and service state.
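
The three steps above correspond to the `phlo init`, `phlo services start`, and `phlo materialize` commands shown on this page. Here is a minimal sketch of driving them from a Python script, assuming the `phlo` executable is on `PATH`. The `workflow_commands` and `run_workflow` helpers and the `dry_run` flag are hypothetical and not part of Phlo.

```python
import shlex
import subprocess


def workflow_commands(project: str, selection: str) -> list[list[str]]:
    """Return the three CLI steps as argument lists (no shell involved)."""
    return [
        ["phlo", "init", project],                        # 1. bootstrap the project
        ["phlo", "services", "start"],                    # 2. start the lakehouse stack
        ["phlo", "materialize", "--select", selection],   # 3. run assets and checks
    ]


def run_workflow(project: str, selection: str, dry_run: bool = True) -> list[str]:
    """Print each step; execute it only when dry_run is False.

    Hypothetical helper for illustration. Stops at the first failing command.
    """
    rendered = []
    for cmd in workflow_commands(project, selection):
        line = shlex.join(cmd)
        rendered.append(line)
        print(f"$ {line}")
        if not dry_run:
            subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return rendered


if __name__ == "__main__":
    run_workflow("my-lakehouse", "dlt_glucose_entries+")
```

With `dry_run=True` the script only prints the commands. This page does not say whether the later commands must run inside the new project directory, so the sketch runs everything from the current directory.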

Project

phlo.yaml, env, workflows

Services

Dagster, MinIO, Nessie, Trino

Pipelines

DLT ingestion, dbt transforms

Quality

schemas and checks on runs

Catalog

branches, tables, lineage

Observe

Observatory, logs, metrics

phlo init · phlo services start · phlo materialize

The point is not the plumbing. The point is a repeatable run with data, checks, lineage, and service state in one place.

WHAT PHLO DOES

Move from source data to governed assets.

Phlo gives the common lakehouse path a project shape: ingest data, merge it into tables, transform it, validate the run, and make the result observable.

init · services · materialize · inspect
# create the project and bring up the stack
$ uv pip install "phlo[defaults]"
$ phlo init glucose-lakehouse
$ phlo services start
  dagster  minio  nessie  trino  observatory

# run assets and inspect what happened
$ phlo materialize --select "dlt_glucose_entries+"
  materialized assets, checks evaluated
$ phlo logs --asset glucose_entries --since 1h
$ phlo lineage show glucose_entries

  1. Ingest data

    Fetch from APIs, files, and databases, then stage data before it reaches tables.

  2. Store and branch

    Write to Iceberg-backed storage with catalog branches for reviewable lakehouse changes.

  3. Transform and check

    Run dbt or Python workflows and evaluate quality checks during materialization.

  4. Observe the run

    Use Observatory, logs, lineage, and metrics to understand pipeline health.
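
The shape of such a run can be sketched in plain Python. This is not Phlo's API: the function names, the in-memory `table` dict, the sample glucose rows, and the 20 to 600 range are all hypothetical stand-ins for a connector, an Iceberg table, and a quality check. Branching and transforms are left out to keep the sketch short.

```python
from dataclasses import dataclass, field


@dataclass
class RunReport:
    """What a run leaves behind for inspection (hypothetical structure)."""
    staged: int = 0
    merged: int = 0
    failures: list[str] = field(default_factory=list)


def ingest() -> list[dict]:
    """Stand-in for fetching from an API, file, or database."""
    return [
        {"id": "a1", "mg_dl": 104},
        {"id": "a2", "mg_dl": 131},
        {"id": "a1", "mg_dl": 106},  # later reading for the same key
    ]


def check(rows: list[dict]) -> list[str]:
    """Return one message per failing row; an empty list means all passed."""
    return [
        f"row {r.get('id')!r}: mg_dl out of range"
        for r in rows
        if not isinstance(r.get("mg_dl"), int) or not 20 <= r["mg_dl"] <= 600
    ]


def merge(table: dict[str, dict], rows: list[dict]) -> int:
    """Upsert staged rows into the table by primary key; last write wins."""
    for row in rows:
        table[row["id"]] = row
    return len(rows)


def materialize(table: dict[str, dict]) -> RunReport:
    """Stage, validate, then merge. Nothing reaches the table if a check fails."""
    staged = ingest()
    report = RunReport(staged=len(staged), failures=check(staged))
    if not report.failures:
        report.merged = merge(table, staged)
    return report


table: dict[str, dict] = {}
report = materialize(table)
print(report.merged, sorted(table), table["a1"]["mg_dl"])
```

Gating the merge on the check result is one possible design choice. It keeps a failed batch out of the table and leaves the failure messages in the report for later inspection.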

One project surface

Configuration, workflows, services, schemas, and transforms live in the same project.

Open lakehouse stack

Phlo coordinates tools like Dagster, MinIO, Nessie, Trino, dbt, and Observatory.

Run visibility

Materializations carry checks, logs, lineage, metrics, and enough context to debug a run.

ARCHITECTURE & GOVERNANCE

Designed for open lakehouse stacks.

Phlo's core stays small. Packages bring the lakehouse services and integrations, while open formats keep the data portable.

Phlo project: CLI · services · checks · materialization

Installed packages: Dagster, MinIO, Nessie, Trino, dbt, Observatory

Services

Postgres, MinIO, Nessie, Trino, Dagster, and optional stack packages.

Sources

Connector packages bring APIs, databases, files, and domain inputs into the lakehouse.

Quality

Checks run with materialized assets and pipeline executions.

Transformations

dbt and Python packages turn raw tables into modeled assets.

Cataloging

Nessie and OpenMetadata packages track branches, schema, and lineage.

Observability

Prometheus, Grafana, and Loki plug in as packages for logs, metrics, and alerts.

Observatory

UI extension packages add routes, slots, settings, and dashboards.

Runtime

Adapters translate package contributions into the active workflow engine.
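
One way to picture the runtime layer is an adapter that takes engine-neutral contributions from packages and hands them to whichever engine is active. The sketch below is an assumption about the general pattern, not Phlo's actual interfaces: `AssetSpec`, `EngineAdapter`, `InProcessAdapter`, `load`, and the `glucose_daily` asset are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


@dataclass(frozen=True)
class AssetSpec:
    """Engine-neutral description of an asset contributed by a package."""
    name: str
    deps: tuple[str, ...]
    fn: Callable[[], object]


class EngineAdapter(Protocol):
    """Anything that can accept AssetSpecs and run them on some engine."""
    def register(self, spec: AssetSpec) -> None: ...
    def run(self) -> list[str]: ...


class InProcessAdapter:
    """Toy engine: runs assets in dependency order in the current process.

    Assumes every dependency is registered and the graph has no cycles.
    """

    def __init__(self) -> None:
        self._specs: dict[str, AssetSpec] = {}

    def register(self, spec: AssetSpec) -> None:
        self._specs[spec.name] = spec

    def run(self) -> list[str]:
        done: list[str] = []

        def visit(name: str) -> None:
            if name in done:
                return
            for dep in self._specs[name].deps:  # upstream assets first
                visit(dep)
            self._specs[name].fn()
            done.append(name)

        for name in self._specs:
            visit(name)
        return done


def load(adapter: EngineAdapter, contributions: list[AssetSpec]) -> None:
    """Hand every package contribution to the active engine's adapter."""
    for spec in contributions:
        adapter.register(spec)


adapter = InProcessAdapter()
load(adapter, [
    AssetSpec("glucose_daily", ("glucose_entries",), lambda: None),
    AssetSpec("glucose_entries", (), lambda: None),
])
order = adapter.run()
print(order)
```

Under this pattern, targeting a different engine would mean writing another class with the same `register` and `run` methods, while the package contributions stay unchanged.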

READY WHEN YOU ARE

Ship governed data workflows with confidence.

Install core Phlo, add the packages your stack needs, start the services, and materialize assets with checks attached to the run.

~/my-lakehouse
$ uv pip install phlo
$ uv add phlo-dagster phlo-nessie phlo-minio phlo-trino
$ uv add phlo-dbt phlo-quality phlo-observatory
$ phlo init my-lakehouse
$ phlo services start
  services ready: dagster, minio, nessie, trino
$ phlo materialize --select "dlt_glucose_entries+"
  materialized assets with package-provided checks