Open-source batch ELT framework - Current source version v0.73.20

Build reliable data pipelines without hiding the machinery.

dpone is a production-oriented Python framework for moving data between databases, APIs, Kafka, and analytical targets with explicit state, staging-first loads, schema evolution, reconciliation, quality gates, and operational artifacts.

Start quickstart | Install dpone | Browse source -> sink guides

Postgres

MSSQL

ClickHouse

BigQuery

Kafka

REST APIs

What dpone is built for

Staging-first loading

Every database sink uses staging or shadow-table flows before target promotion, so heavy writes are kept away from final tables until commit time.
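The staging-then-promote idea can be illustrated with a minimal sketch. This is not dpone's implementation: it uses SQLite from the Python standard library, and the table names `target` and `target_staging` and the function `load_via_staging` are hypothetical names chosen for the example.

```python
import sqlite3


def load_via_staging(conn, rows):
    """Write rows to a staging table first, then promote them to the target.

    Hypothetical illustration only; table and function names are assumptions.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS target (id INTEGER PRIMARY KEY, name TEXT)"
    )
    # Start every load from an empty staging table.
    conn.execute("DROP TABLE IF EXISTS target_staging")
    conn.execute(
        "CREATE TABLE target_staging (id INTEGER PRIMARY KEY, name TEXT)"
    )

    # Heavy write: lands in staging only. A bad batch fails here and is
    # rolled back without touching the target table.
    with conn:
        conn.executemany(
            "INSERT INTO target_staging (id, name) VALUES (?, ?)", rows
        )

    # Promotion: one short transaction copies staged rows into the target.
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO target (id, name) "
            "SELECT id, name FROM target_staging"
        )

    conn.execute("DROP TABLE target_staging")
```

In this sketch a batch with a duplicate key raises `sqlite3.IntegrityError` during the staging write, so the target table keeps its previous contents.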

Explicit state

XMin, Kafka offsets, CDC offsets, run state, and source cursors can be persisted in supported state backends instead of disappearing into logs.
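As a rough sketch of what persisting a source cursor means in practice, the example below stores a high-water mark in a JSON file and uses it to skip rows already seen. The class `FileStateStore`, the function `run_incremental`, and the file-based backend are assumptions made for illustration; they are not dpone's state API or one of its supported backends.

```python
import json
import os
import tempfile
from pathlib import Path


class FileStateStore:
    """Hypothetical state backend: one JSON document on local disk."""

    def __init__(self, path):
        self.path = Path(path)

    def load(self):
        if not self.path.exists():
            return {}
        return json.loads(self.path.read_text(encoding="utf-8"))

    def save(self, state):
        # Write to a temp file and rename so a crash cannot leave a
        # half-written state file behind.
        fd, tmp_name = tempfile.mkstemp(dir=str(self.path.parent), suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(state, handle, sort_keys=True)
        os.replace(tmp_name, self.path)


def run_incremental(store, cursor_key, rows):
    """Return only rows newer than the stored cursor, then advance it."""
    state = store.load()
    last_seen = state.get(cursor_key, 0)
    new_rows = [row for row in rows if row["id"] > last_seen]
    if new_rows:
        state[cursor_key] = max(row["id"] for row in new_rows)
        store.save(state)
    return new_rows
```

Because the cursor lives in a file rather than in a log line, a second run over the same input returns nothing, and the stored value can be inspected directly.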

Schema evolution

Safe additions and widening are automated by default. Breaking type changes can fail fast or route into __dpone__nc__* generated columns.
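The distinction between safe and breaking changes can be sketched as a small classifier. The widening ladder, the function names, and the `on_breaking` parameter below are hypothetical and do not reflect dpone's actual rules or configuration keys; the sketch also does not attempt to name the generated columns.

```python
# Assumed widening ladder for illustration: each type can hold every value
# of the types before it.
WIDENING_ORDER = ["smallint", "integer", "bigint"]


def classify_change(old_type, new_type):
    """Classify one column change as add, unchanged, widen, or breaking."""
    if old_type is None:
        return "add"
    if old_type == new_type:
        return "unchanged"
    if old_type in WIDENING_ORDER and new_type in WIDENING_ORDER:
        if WIDENING_ORDER.index(new_type) > WIDENING_ORDER.index(old_type):
            return "widen"
    return "breaking"


def plan_evolution(target_schema, incoming_schema, on_breaking="fail"):
    """Map each incoming column to an action.

    on_breaking="fail" raises on the first breaking change;
    on_breaking="route" marks the column for a separate generated column.
    """
    actions = {}
    for column, new_type in incoming_schema.items():
        kind = classify_change(target_schema.get(column), new_type)
        if kind == "breaking":
            if on_breaking == "fail":
                raise ValueError(
                    f"breaking type change on {column!r}: "
                    f"{target_schema[column]} -> {new_type}"
                )
            kind = "route"
        actions[column] = kind
    return actions
```

For example, `plan_evolution({"id": "integer"}, {"id": "bigint", "note": "text"})` classifies `id` as a widening and `note` as an addition, while narrowing `bigint` to `integer` either raises or is marked `route`, depending on `on_breaking`.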

Operational UX

Doctor, plan, run reports, quality gates, state inspection, connector certification, and performance advice are first-class workflows.

Fast paths

Need	Start here
Install and smoke-test dpone	Installation
Run dpone with native MSSQL/ClickHouse tools in Docker	Runtime Docker image
Run the first manifest	Quickstart
Execute from CLI or Python	Running pipelines
Try a local database pipeline	First local pipeline
Create the first Airflow DAG preview	First Airflow DAG — Start from your route (source → sink)
Configure database/API/Kafka credentials	Connections and credentials
Choose a pipeline combination	Source -> sink matrix
Pick append/upsert/replace semantics	Load strategies
Split nested JSON into root/child tables	Nested normalization
Understand type conversion	Type mapping matrix
Control automatic type detection	Type inference
Declare explicit column contracts	Schema contracts
Manage planned renames and aliases	Schema Identity
Gate migration packs by downstream impact	Schema impact
Tune target DDL, indexes, compression, and storage	Physical design
Enforce row contracts and quarantine bad rows	Runtime data contracts
Keep contracts safe on streaming/native fast paths	Streaming-safe contracts
Apply physical DDL safely	Physical DDL apply
Package schema and physical DDL changes into migration evidence, including shadow cutover, environment promotion, and PR/MR review bundles	Schema migration control
Use production-safe defaults	Production profiles
Bundle run evidence for certification	Unified run evidence
Decide whether a minor/major release can ship	Release evidence
Use PostgreSQL as source, sink, or state	PostgreSQL guide
Use SQL Server as source, sink, or state	MSSQL guide
Use BigQuery as analytical sink or state backend	BigQuery guide
Use ClickHouse as analytical source or sink	ClickHouse guide
Use bounded Kafka batch source/sink	Kafka guide
Use Postgres transaction-ID incremental extraction	Postgres XMin
Export Prometheus and OpenTelemetry runtime metrics	Runtime observability
Gate data product assertions, freshness, access/privacy, volume, latency, consumers, incidents, error budgets, policy waivers, and fleet release freeze evidence	Data Product SLO, assertions and incidents
Turn blocked data product gates into safe owner-routed repair plans and controlled execution receipts	Data Product Remediation Runbooks
Operate certification, recovery, reconciliation, deployment, and catalog evidence	Operational control plane
Prove connectors with local-live/real-local/vendor-live gates	Live certification
Stage large files through S3/GCS/Azure	Object storage staging
Certify route capabilities before source IO	Route capability certification
Produce SBOM/provenance/signing evidence	Supply-chain evidence
Run certification, contracts, quarantine, rollback, and marketplace controls	`dpone ops`
Prepare for production operations	Production readiness
Understand CI/CD and release automation	CI/CD

Install

pip install dpone
dpone --version
dpone -v
pip install "dpone[postgres,mssql,clickhouse,kafka,gcp,s3,azure,pandas,vault]"
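If it is more convenient to confirm the installation from Python than from the shell, a standard-library check along these lines may help. It assumes only that the distribution is registered under the name `dpone`, as the `pip install` command above suggests; `installed_version` is a hypothetical helper, not part of dpone.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("dpone")
    if found is None:
        print("dpone is not installed in this environment")
    else:
        print(f"dpone {found} is installed")
```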

Local documentation preview

python -m pip install -r docs/requirements.txt
mkdocs serve

The GitHub Pages workflow builds the same site with mkdocs build --strict on every docs pull request and deploys from master.