PYTHON FRAMEWORK FOR LAKEHOUSES

Build data workflows that flow to production.

Phlo is a modular Python framework for lakehouse applications. Add packages for orchestration, storage, ingestion, quality, observability, and UI extensions, then run the platform from one project.

pip install phlo

phlo quickstart

my-lakehouse

# choose the lakehouse services you need
$ uv pip install "phlo[defaults]"
$ phlo init my-lakehouse
$ phlo services start
  dagster  minio  nessie  trino

# materialize assets with checks and lineage
$ phlo materialize --select "dlt_glucose_entries+"
  checks passed · lineage updated
$ phlo logs --asset glucose_entries

Services: dagster + nessie
Quality checks: run with materializations
Lineage: updated per run
Observatory: pipeline visibility

PROJECT WORKFLOW

Run a lakehouse as one project.

Phlo wraps the moving parts of a lakehouse into a Python project: services, workflows, materializations, checks, lineage, and observability.

01
Bootstrap a project
Create the workspace, configuration, env files, and workflow folders Phlo expects.
02
Start the lakehouse stack
Bring up the services your project uses: orchestration, storage, catalog, query, and UI.
03
Materialize and inspect
Run assets, evaluate checks, then inspect lineage, logs, metrics, and service state.
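The three steps above can be scripted. The following is a minimal sketch, not part of Phlo: it only assembles the CLI commands shown on this page and, by default, prints them without running anything. The helper names `workflow_commands` and `run_workflow` are hypothetical, and the sketch assumes every command is run from the current working directory, which this page does not confirm.

```python
import shlex
import subprocess


def workflow_commands(project: str, selection: str) -> list[list[str]]:
    """Return the three CLI invocations from the steps above, in order."""
    return [
        ["phlo", "init", project],                       # 01 bootstrap a project
        ["phlo", "services", "start"],                   # 02 start the lakehouse stack
        ["phlo", "materialize", "--select", selection],  # 03 materialize assets
    ]


def run_workflow(project: str, selection: str, dry_run: bool = True) -> list[str]:
    """Print each command; execute it only when dry_run is False."""
    rendered = []
    for cmd in workflow_commands(project, selection):
        line = shlex.join(cmd)
        rendered.append(line)
        print(f"$ {line}")
        if not dry_run:
            # check=True stops the sequence at the first failing step.
            subprocess.run(cmd, check=True)
    return rendered


# Dry run: prints the commands without needing Phlo installed.
rendered = run_workflow("my-lakehouse", "dlt_glucose_entries+")
```

Keeping the dry run as the default makes the sequence reviewable before any service is started.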

Project: phlo.yaml, env, workflows
Services: Dagster, MinIO, Nessie, Trino
Pipelines: DLT ingestion, dbt transforms
Quality: schemas and checks on runs
Catalog: branches, tables, lineage
Observe: Observatory, logs, metrics

phlo init · phlo services start · phlo materialize

The point is not the plumbing. The point is a repeatable run with data, checks, lineage, and service state in one place.

WHAT PHLO DOES

Move from source data to governed assets.

Phlo gives the common lakehouse path a project shape: ingest data, merge it into tables, transform it, validate the run, and make the result observable.

init · services · materialize · inspect

# create the project and bring up the stack
$ uv pip install "phlo[defaults]"
$ phlo init glucose-lakehouse
$ phlo services start
  dagster  minio  nessie  trino  observatory

# run assets and inspect what happened
$ phlo materialize --select "dlt_glucose_entries+"
  materialized assets, checks evaluated
$ phlo logs --asset glucose_entries --since 1h
$ phlo lineage show glucose_entries

01
Ingest data
Fetch from APIs, files, and databases, then stage data before it reaches tables.
02
Store and branch
Write to Iceberg-backed storage with catalog branches for reviewable lakehouse changes.
03
Transform and check
Run dbt or Python workflows and evaluate quality checks during materialization.
04
Observe the run
Use Observatory, logs, lineage, and metrics to understand pipeline health.
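To make the ingest, stage, merge, and check path concrete, here is a plain-Python sketch of the general pattern. It does not use Phlo's API: `Table`, `materialize`, `normalize`, and the sample rows are all hypothetical, and the choice to merge only when every check passes is an assumption made for illustration, not documented Phlo behavior.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

Row = dict[str, Any]
Check = Callable[[list[Row]], bool]


@dataclass
class Table:
    """Stand-in for a lakehouse table, keyed by one column."""
    key: str
    rows: dict[Any, Row] = field(default_factory=dict)

    def merge(self, staged: list[Row]) -> None:
        # Upsert: a staged row replaces any existing row with the same key.
        for row in staged:
            self.rows[row[self.key]] = row


def materialize(
    source: list[Row],
    table: Table,
    transform: Callable[[Row], Row],
    checks: dict[str, Check],
) -> dict[str, bool]:
    """Stage and transform rows, evaluate checks, merge only if all pass."""
    staged = [transform(dict(row)) for row in source]
    results = {name: check(staged) for name, check in checks.items()}
    if all(results.values()):
        table.merge(staged)
    return results


def normalize(row: Row) -> Row:
    # Example transform: coerce the reading to a float.
    return {**row, "value": float(row["value"])}


checks = {
    "non_empty": lambda rows: len(rows) > 0,
    "has_id": lambda rows: all(r.get("id") is not None for r in rows),
}

# Made-up sample rows, for illustration only.
entries = [{"id": 1, "value": "104"}, {"id": 2, "value": "98"}]
table = Table(key="id")
results = materialize(entries, table, normalize, checks)
print(results, len(table.rows))
```

Returning the per-check results from the same call that writes the table mirrors the idea of checks travelling with the materialization rather than running as a separate job.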

One project surface

Configuration, workflows, services, schemas, and transforms live in the same project.

Open lakehouse stack

Phlo coordinates tools like Dagster, MinIO, Nessie, Trino, dbt, and Observatory.

Run visibility

Materializations carry checks, logs, lineage, metrics, and enough context to debug.

ARCHITECTURE & GOVERNANCE

Designed for open lakehouse stacks

Phlo's core stays small. Packages bring the lakehouse services and integrations, while open formats keep the data portable.

Phlo project: CLI · services · checks · materialization
Installed packages: Dagster · MinIO · Nessie · Trino · dbt · Observatory

Services

Postgres, MinIO, Nessie, Trino, Dagster, and optional stack packages.

Sources

Connector packages bring APIs, databases, files, and domain inputs into the lakehouse.

Quality

Checks run with materialized assets and pipeline executions.

Transformations

dbt and Python packages turn raw tables into modeled assets.

Cataloging

Nessie and OpenMetadata packages track branches, schema, and lineage.

Observability

Prometheus, Grafana, Loki, logs, metrics, and alerts plug in as packages.

Observatory

UI extension packages add routes, slots, settings, and dashboards.

Runtime

Adapters translate package contributions into the active workflow engine.
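The runtime idea above, where adapters translate package contributions into the active workflow engine, follows a common adapter pattern. The sketch below illustrates that pattern in general terms only. `Contribution`, `EngineAdapter`, `InMemoryEngine`, and `load` are hypothetical names invented for this example and are not Phlo's actual interfaces.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


@dataclass(frozen=True)
class Contribution:
    """Something a package offers to the project (hypothetical shape)."""
    package: str
    kind: str  # e.g. "asset" or "check"
    name: str
    fn: Callable[[], object]


class EngineAdapter(Protocol):
    """Translates contributions into whatever the active engine expects."""
    def register(self, contribution: Contribution) -> None: ...


class InMemoryEngine:
    """Toy engine: stores callables under a kind-qualified name."""
    def __init__(self) -> None:
        self.jobs: dict[str, Callable[[], object]] = {}

    def register(self, contribution: Contribution) -> None:
        self.jobs[f"{contribution.kind}:{contribution.name}"] = contribution.fn


def load(contributions: list[Contribution], adapter: EngineAdapter) -> None:
    # Packages never touch the engine directly; they go through the adapter.
    for contribution in contributions:
        adapter.register(contribution)


engine = InMemoryEngine()
load(
    [
        Contribution("example-ingest", "asset", "glucose_entries", lambda: "rows loaded"),
        Contribution("example-quality", "check", "glucose_entries_not_empty", lambda: True),
    ],
    engine,
)
print(sorted(engine.jobs))
```

Because packages depend only on the adapter interface, swapping the workflow engine would mean writing a new adapter rather than changing every package.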

READ THE SYSTEM

Guides that explain the shape of Phlo

Start with the package model that users install and operate, then go deeper only when you are authoring an integration package.

READY WHEN YOU ARE

Ship governed data workflows with confidence.

Install core Phlo, add the packages your stack needs, start the services, and materialize assets with checks attached to the run.

~/my-lakehouse

$ uv pip install phlo
$ uv add phlo-dagster phlo-nessie phlo-minio phlo-trino
$ uv add phlo-dbt phlo-quality phlo-observatory
$ phlo init my-lakehouse
$ phlo services start
  services ready: dagster, minio, nessie, trino
$ phlo materialize --select "dlt_glucose_entries+"
  materialized assets with package-provided checks

Guides that explain the shape of Phlo

Getting started

Plugin system

Custom packages

Observatory extensions
