Open-source batch ELT framework

Build reliable data pipelines without hiding the machinery.

dpone is a production-oriented Python framework for moving data between databases, APIs, Kafka, and analytical targets with explicit state, staging-first loads, schema evolution, reconciliation, quality gates, and operational artifacts.

Postgres
MSSQL
ClickHouse
BigQuery
Kafka
REST APIs

What dpone is built for

Staging-first loading

Every database sink uses staging or shadow-table flows before target promotion, so heavy writes are kept away from final tables until commit time.
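The staging-first idea can be sketched with the standard-library sqlite3 module. This illustrates the general pattern only and is not dpone's implementation: the `events` table, the `events__staging` name, and the `staged_append` helper are all hypothetical.

```python
import sqlite3

TARGET = "events"              # hypothetical final table
STAGING = "events__staging"    # hypothetical staging table name


def staged_append(conn, rows):
    """Write rows to a staging table, then promote them to TARGET atomically."""
    conn.execute(f"DROP TABLE IF EXISTS {STAGING}")
    conn.execute(
        f"CREATE TABLE {STAGING} (id INTEGER PRIMARY KEY, payload TEXT NOT NULL)"
    )
    try:
        # Heavy writes and constraint checks hit staging only.
        conn.executemany(
            f"INSERT INTO {STAGING} (id, payload) VALUES (?, ?)", rows
        )
        # Promotion: the target is touched inside one short transaction.
        conn.execute("BEGIN")
        try:
            conn.execute(
                f"INSERT INTO {TARGET} (id, payload) "
                f"SELECT id, payload FROM {STAGING}"
            )
            conn.execute("COMMIT")
        except sqlite3.Error:
            conn.execute("ROLLBACK")
            raise
    finally:
        # Staging is disposable whether the load succeeded or failed.
        conn.execute(f"DROP TABLE IF EXISTS {STAGING}")


def make_connection():
    """In-memory database in autocommit mode so BEGIN/COMMIT are explicit."""
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        f"CREATE TABLE {TARGET} (id INTEGER PRIMARY KEY, payload TEXT NOT NULL)"
    )
    return conn
```

A batch containing a bad row fails while it is being written to staging, so the target keeps its previous contents.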

Explicit state

XMin, Kafka offsets, CDC offsets, run state, and source cursors can be persisted in supported state backends instead of disappearing into logs.
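As a rough illustration of why persisted state matters, the sketch below keeps a source cursor in a small SQLite table and uses it to resume an incremental read. The `SqliteStateStore` class, the `pipeline_state` table, and the `extract_increment` helper are hypothetical names chosen for this example. They are not dpone's state backend API.

```python
import json
import sqlite3


class SqliteStateStore:
    """Tiny key/value state store (hypothetical, for illustration only)."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS pipeline_state "
            "(key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )
        conn.commit()

    def get(self, key, default=None):
        row = self.conn.execute(
            "SELECT value FROM pipeline_state WHERE key = ?", (key,)
        ).fetchone()
        return json.loads(row[0]) if row else default

    def set(self, key, value):
        self.conn.execute(
            "INSERT OR REPLACE INTO pipeline_state (key, value) VALUES (?, ?)",
            (key, json.dumps(value)),
        )
        self.conn.commit()


def extract_increment(source_rows, store, key):
    """Return rows past the stored cursor, then advance the cursor."""
    last_offset = store.get(key, 0)
    batch = [row for row in source_rows if row["offset"] > last_offset]
    if batch:
        store.set(key, max(row["offset"] for row in batch))
    return batch
```

Because the cursor lives in a queryable table, a second run picks up where the first one stopped, and an operator can inspect or reset the value directly.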

Schema evolution

Safe additions and widening are automated by default. Breaking type changes can fail fast or route into __dpone__nc__* generated columns.
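The distinction between safe and breaking changes can be sketched as a small classifier. This is a minimal illustration under stated assumptions: the type names, the `SAFE_WIDENINGS` set, and the `classify` and `plan` functions are hypothetical and do not reflect dpone's actual rules. Routing breaking changes into `__dpone__nc__*` columns is left out because the naming after the prefix is not specified here.

```python
from enum import Enum


class Change(Enum):
    NONE = "none"
    ADD = "add"
    WIDEN = "widen"
    BREAKING = "breaking"


# Assumed widening pairs (old type, new type); a real matrix would be larger.
SAFE_WIDENINGS = {
    ("int32", "int64"),
    ("float32", "float64"),
    ("int32", "float64"),
}


def classify(target, incoming):
    """Classify each incoming column against the target schema."""
    changes = {}
    for column, new_type in incoming.items():
        old_type = target.get(column)
        if old_type is None:
            changes[column] = Change.ADD
        elif old_type == new_type:
            changes[column] = Change.NONE
        elif (old_type, new_type) in SAFE_WIDENINGS:
            changes[column] = Change.WIDEN
        else:
            changes[column] = Change.BREAKING
    return changes


def plan(target, incoming):
    """Return the change plan, failing fast if any change is breaking."""
    changes = classify(target, incoming)
    breaking = sorted(c for c, kind in changes.items() if kind is Change.BREAKING)
    if breaking:
        raise ValueError(f"breaking type change in columns: {breaking}")
    return changes
```

Additions and widenings pass through `plan` untouched, while a change such as `int64` to `text` raises before any DDL would run.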

Operational UX

Doctor, plan, run reports, quality checks, state inspection, connector certification, and performance advice are first-class workflows.

Fast paths

| Need | Start here |
| --- | --- |
| Install and smoke-test dpone | Installation |
| Run the first manifest | Quickstart |
| Execute from CLI or Python | Running pipelines |
| Try a local database pipeline | First local pipeline |
| Configure database/API/Kafka credentials | Connections and credentials |
| Choose a pipeline combination | Source -> sink matrix |
| Pick append/upsert/replace semantics | Load strategies |
| Split nested JSON into root/child tables | Nested normalization |
| Understand type conversion | Type mapping matrix |
| Control automatic type detection | Type inference |
| Declare explicit column contracts | Schema contracts |
| Tune target DDL, indexes, compression, and storage | Physical design |
| Enforce row contracts and quarantine bad rows | Runtime data contracts |
| Keep contracts safe on streaming/native fast paths | Streaming-safe contracts |
| Apply physical DDL safely | Physical DDL apply |
| Use production-safe defaults | Production profiles |
| Bundle run evidence for certification | Unified run evidence |
| Decide whether a minor/major release can ship | Release evidence |
| Use PostgreSQL as source, sink, or state | PostgreSQL guide |
| Use SQL Server as source, sink, or state | MSSQL guide |
| Use BigQuery as analytical sink or state backend | BigQuery guide |
| Use ClickHouse as analytical source or sink | ClickHouse guide |
| Use bounded Kafka batch source/sink | Kafka guide |
| Use Postgres transaction-ID incremental extraction | Postgres XMin |
| Export Prometheus and OpenTelemetry runtime metrics | Runtime observability |
| Operate certification, recovery, reconciliation, deployment, and catalog evidence | Operational control plane |
| Prove connectors with local-live/real-local/vendor-live gates | Live certification |
| Stage large files through S3/GCS/Azure | Object storage staging |
| Produce SBOM/provenance/signing evidence | Supply-chain evidence |
| Run certification, contracts, quarantine, rollback, and marketplace controls | dpone ops |
| Prepare for production operations | Production readiness |
| Understand CI/CD and release automation | CI/CD |

Install

pip install dpone
dpone --version
dpone -v
pip install "dpone[postgres,mssql,clickhouse,kafka,gcp,s3,azure,pandas,vault]"

Local documentation preview

python -m pip install -r docs/requirements.txt
mkdocs serve

The GitHub Pages workflow builds the same site with mkdocs build --strict on every docs pull request and deploys from master.