Testing documentation¶

This folder is the single home for dpone testing documentation. Start here when you need to choose a test layer, run a gate, debug a red workflow, or understand the source -> sink certification matrix.

Documentation map¶

Need	Doc
Credential-free pipeline behavior and CI fixtures	Hermetic pipeline tests
Local/default regression gate, package smoke, generated docs checks	Local/default testing
Service markers, live/vendor boundaries, Docker services, artifact policy	Integration tests
End-to-end chunked backfill matrix and Airflow interval integration	Backfill integration
Replay recovery/control-plane gate for resync/resume adapters	Replay integration
Manual source -> sink matrix category and GitHub Actions workflow	Manual integration matrix
Critical-route type matrix, physical DDL and temporal fidelity certification	Type mapping matrix
Manual local-live/vendor-live connector certification and benchmark/SLO evidence	Live certification
Detailed 200-case source -> sink strategy behavior model, volumes, artifacts, failure recovery	Source/sink matrix runbook
SQL Server local wide-table mock matrix and MSSQL-specific troubleshooting	Local MSSQL mock matrix
Local native transfer benchmarks for Postgres -> MSSQL and MSSQL -> ClickHouse	Native transfer benchmark artifact: 2026-06-11
Connector capability evidence	Connector certification
CI/CD workflow map and red-build runbooks	CI/CD

TDD policy¶

dpone development follows red-green-refactor across the test pyramid:

Red — before touching implementation code, write (or extend) a failing test at the lowest layer that can express the requirement.
Green — implement the minimal change that makes the test pass; run the focused test file, then the affected suites.
Refactor — clean up under a green bar; architecture fitness, import rules, and module-size gates must stay green (uv run pytest tests/test_architecture_fitness_gate.py, uv run dpone docs check-import-rules, uv run dpone docs check-module-size --baseline docs/module_size_baseline.json).

Which layer is mandatory for what:

Layer	Mandatory when	Example
Unit (pure logic, fakes)	always — every behavior change ships with unit coverage	chunk planner boundaries, config validation
Contract (module boundaries, schemas, CLI output)	public contract, schema, CLI, or artifact format changes	XCom summary schema, strategy registration
Integration (`integration_*` markers, local Docker)	data movement semantics, sink/source finalization, state/ledger behavior	backfill route matrix, staging finalizers
E2E / certification (matrix and live workflows)	release gates, cross-route guarantees, vendor paths	`mock_local` matrix, live certification
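As an illustration of the unit layer, here is a minimal red-first sketch for the "chunk planner boundaries" example. `plan_chunks` is a hypothetical helper written for this example, not a dpone API; the point is the shape of the boundary assertions.

```python
from datetime import date, timedelta


def plan_chunks(start: date, end: date, days: int) -> list:
    """Hypothetical planner: split [start, end) into half-open chunks."""
    if days <= 0:
        raise ValueError("days must be positive")
    chunks = []
    cursor = start
    while cursor < end:
        # Clamp the last chunk so it never runs past the requested end.
        upper = min(cursor + timedelta(days=days), end)
        chunks.append((cursor, upper))
        cursor = upper
    return chunks


def test_chunks_cover_range_without_gaps_or_overlap():
    chunks = plan_chunks(date(2024, 1, 1), date(2024, 1, 11), days=4)
    assert chunks[0][0] == date(2024, 1, 1)
    assert chunks[-1][1] == date(2024, 1, 11)
    # Each chunk starts exactly where the previous one ended.
    assert all(a[1] == b[0] for a, b in zip(chunks, chunks[1:]))


def test_empty_range_yields_no_chunks():
    assert plan_chunks(date(2024, 1, 1), date(2024, 1, 1), days=4) == []
```

Written before the planner exists, both tests fail (red); the smallest planner that satisfies them is the green step.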

Rules of thumb:

a bug fix starts with a regression test that reproduces the bug;
new sink/source/strategy mechanics are not "done" without an integration_* case that moves real rows and checks parity;
shared test helpers live in reusable toolkits (see tests/integration/backfill/backfill_toolkit.py) instead of per-test SQL copy-paste;
the coverage gate (fail_under in pyproject.toml) is a ratchet: raise it as the measured baseline grows, never lower it.
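The ratchet rule can be stated as a one-line function. This is only a sketch of the policy: `ratchet_fail_under` is a hypothetical name, and rounding the measured coverage down to a whole percent is an assumption.

```python
import math


def ratchet_fail_under(current: int, measured: float) -> int:
    """Return the next fail_under: raise it to the measured baseline, never lower it."""
    # Round down so the new threshold does not exceed what was actually measured.
    return max(current, math.floor(measured))
```

For example, a current threshold of 85 with 88.7% measured coverage moves to 88, while a dip to 83.2% leaves the threshold at 85 and the gate fails until coverage recovers.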

Quick decision guide¶

flowchart TD
    A["What changed?"] --> B{"Runtime code or public contract?"}
    B -- yes --> C["Run default regression gate"]
    B -- docs only --> D["Run docs contract tests and mkdocs build"]
    C --> E{"Connector, strategy, or source/sink behavior?"}
    E -- yes --> F["Run mock_contract matrix"]
    E -- local DB/Kafka path --> G["Run mock_local or service marker"]
    E -- vendor/API path --> H["Run vendor_live certification"]
    D --> I["Update docs links/nav/index if needed"]
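The flowchart can also be read as a small lookup. The sketch below is illustrative only: `Change` and `gates_for` are hypothetical names, and it assumes the three connector branches can apply together, which the flowchart itself does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """Hypothetical description of a change; fields mirror the flowchart questions."""

    runtime_or_contract: bool = False
    connector_or_strategy: bool = False
    local_service_path: bool = False
    vendor_path: bool = False


def gates_for(change: Change) -> list:
    # Docs-only changes take the short branch of the flowchart.
    if not change.runtime_or_contract:
        return ["docs contract tests", "mkdocs build --strict"]
    gates = ["default regression gate"]
    if change.connector_or_strategy:
        gates.append("mock_contract matrix")
    if change.local_service_path:
        gates.append("mock_local or service marker")
    if change.vendor_path:
        gates.append("vendor_live certification")
    return gates
```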

Test layers¶

Layer	Marker / workflow	External credentials	Purpose
Unit/contract	default pytest	no	Fast local confidence for models, planners, codecs, SQL builders, docs contracts, CLI output.
Default non-live gate	`pytest -m "not integration_live"`	no	Release-quality local gate for ordinary changes.
`mock_contract` matrix	`integration_matrix`	no	Validates all 25 source -> sink pairs and 200 strategy contracts, including `snapshot_diff`, DB-target `partition_replace`/`scd2`/`backfill`, Postgres `xmin`, and Postgres/MSSQL `cdc`, without starting services.
`mock_local` matrix	`integration_matrix_mock`	no	Starts disposable local services where possible and validates local/mock pipelines.
Type matrix certification	`type_matrix_certification` / `live-certification.yml profile=type_matrix_certification`	no for contract suite, no external credentials for local Docker profile	Validates MSSQL -> ClickHouse and Postgres -> MSSQL type decisions, physical DDL decisions, LowCardinality, temporal fidelity, schema explain and local-live fixture handoff.
Native transfer certification	`live-certification.yml profile=native_transfer`	no external credentials	Validates critical native transfer route evidence, strategy certification bundle handoff, and native-transfer release evidence pack.
Service-specific integration	`integration_postgres`, `integration_mssql`, `integration_clickhouse`, `integration_kafka`	no for local Docker, yes for external endpoints	Exercises native clients, staging, bulk paths, state, CDC, and connector behavior.
Backfill E2E matrix	`integration_backfill` / `backfill-integration.yml`	no external credentials	Real chunked backfill runs (route x inner-strategy matrix, resume/parallel/verification scenarios, interval-driven Airflow runs) against local Docker services.
`local_live` certification	`live-certification.yml`	no external credentials	Starts local Postgres/MSSQL/ClickHouse/Kafka/MinIO, runs service markers, matrix, `benchmark-slo-gate`, certification pack, and evidence chain.
`real_local` certification	`live-certification.yml`	no external credentials	Runs the local service stack with `DPONE_MATRIX_RUN_MODE=real_local` and produces performance, state/reconciliation, release evidence pack, and evidence chain artifacts for minor/major releases.
Replay integration	`integration_replay`	no by default	Validates replay adapter ordering, injected live-backend contracts, reconciliation, and state commit behavior for `dpone resync` and `dpone resume`.
`vendor_live`	`integration_live` and provider markers	yes	Exercises real managed/vendor systems such as BigQuery or external APIs.

External credentials are required for vendor_live and for service-specific markers that target external endpoints. The mock_contract, mock_local, local_live, and real_local layers use deterministic local test passwords, local containers, or in-process mock servers.
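The opt-in layers are switched on by environment variables, as the commands in the Common commands section show. How dpone wires those variables to markers is not described on this page, so the following is a hedged sketch: `REQUIRED_ENV` and `skip_reason` are hypothetical, and the marker-to-variable pairs are only inferred from those commands.

```python
from collections.abc import Mapping
from typing import Optional

# Marker -> environment switches, inferred from the commands on this page.
# The real wiring inside dpone may differ; this mapping is an assumption.
REQUIRED_ENV = {
    "integration_matrix": ("DPONE_RUN_INTEGRATION", "DPONE_RUN_INTEGRATION_MATRIX"),
    "integration_matrix_mock": ("DPONE_RUN_INTEGRATION", "DPONE_RUN_INTEGRATION_MATRIX"),
    "integration_replay": ("DPONE_RUN_INTEGRATION_REPLAY",),
}


def skip_reason(markers: set, env: Mapping) -> Optional[str]:
    """Return why a test should be skipped, or None when it may run."""
    for marker in sorted(markers):
        for name in REQUIRED_ENV.get(marker, ()):
            if env.get(name) != "1":
                return f"{marker} requires {name}=1"
    return None
```

A pytest collection hook could call a function like this with each item's marker names and `os.environ`, then attach a skip marker when a reason is returned.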

Where tests live¶

Area	Path
Source -> sink strategy matrix	`tests/integration/matrix/`
Matrix registry contracts	`tests/test_integration_matrix_contracts.py`
CI/CD documentation contracts	`tests/test_cicd_docs_contracts.py`
Docs and language contracts	`tests/test_docs_*`
PostgreSQL live/local tests	`tests/integration/postgres/`
MSSQL live/local tests	`tests/integration/mssql/`
ClickHouse live/local tests	`tests/integration/clickhouse/`
Kafka live/local tests	`tests/integration/kafka/`
Backfill E2E matrix and scenarios	`tests/integration/backfill/`
Airflow interval-driven backfill runs	`tests/integration/airflow/`
Replay recovery/control-plane tests	`tests/integration/replay/`
Provider/API mock and live tests	`tests/integration/<provider>/`

Focused Postgres -> MSSQL native transfer coverage lives in tests/integration/mssql/test_postgres_to_mssql_native_transfer_integration.py. It uses the same local Docker Postgres and SQL Server services as the broader matrix, but directly exercises COPY -> mssql-delimited artifact -> bcp -> staging/finalizer with lossless text codec assertions.
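To make "lossless text codec assertions" concrete, the sketch below shows the kind of round-trip property such a test can check. The real mssql-delimited encoding is not described on this page; this example borrows the PostgreSQL COPY text convention (`\N` for NULL, backslash escapes for control characters), and `encode_field` / `decode_field` are hypothetical names.

```python
from typing import Optional

NULL_TOKEN = r"\N"  # assumed NULL marker, as in PostgreSQL COPY text format
_ESCAPES = {"\\": "\\\\", "\t": "\\t", "\n": "\\n", "\r": "\\r"}
_UNESCAPES = {"\\": "\\", "t": "\t", "n": "\n", "r": "\r"}


def encode_field(value: Optional[str]) -> str:
    """Encode one field so NULL, empty string, and control characters stay distinct."""
    if value is None:
        return NULL_TOKEN
    return "".join(_ESCAPES.get(ch, ch) for ch in value)


def decode_field(text: str) -> Optional[str]:
    """Invert encode_field; raises on malformed escape sequences."""
    if text == NULL_TOKEN:
        return None
    out = []
    chars = iter(text)
    for ch in chars:
        if ch == "\\":
            # A backslash always introduces exactly one escaped character.
            out.append(_UNESCAPES[next(chars)])
        else:
            out.append(ch)
    return "".join(out)


# Round-trip property: every value, including the tricky ones, survives unchanged.
for value in (None, "", "a\tb\nc", "\\N", "back\\slash"):
    assert decode_field(encode_field(value)) == value
```

The literal two-character string `\N` is the interesting case: it encodes with a doubled backslash, so it can never be confused with the NULL token.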

Current focused cases:

full_refresh: PostgreSQL COPY into lossless mssql-delimited artifacts, SQL Server bcp, staging-first shadow swap, empty-string/NULL separation, and text values containing newline/tab characters.
incremental_merge: MSSQL default delete_insert finalization with one existing key updated, new keys inserted, and unrelated target rows preserved.
partitioned full_refresh: Spark-like range partitioning with partitioning.export_workers, partitioning.load_workers, deterministic transfer_partition_id, partition bounds metadata, and parallel partition load into MSSQL staging.
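The incremental_merge case checks three outcomes at once. A minimal in-memory model of delete_insert semantics, using plain dicts in place of the target and staging tables (the function below is illustrative, not dpone code), makes them easy to see:

```python
def delete_insert(target: dict, staged: dict) -> dict:
    """Delete target rows whose key is staged, then insert every staged row."""
    kept = {key: row for key, row in target.items() if key not in staged}
    return {**kept, **staged}


target = {1: "old", 2: "keep"}
staged = {1: "new", 3: "added"}
result = delete_insert(target, staged)
# Key 1 is updated, key 3 is inserted, key 2 is preserved untouched.
assert result == {1: "new", 2: "keep", 3: "added"}
```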

Common commands¶

Default release-quality gate:

uv sync --all-extras
uv run ruff check .
uv run ruff format --check .
uv run mypy --config-file mypy.ini
uv run pytest -m "not integration_live" --cov=src/dpone --cov-report=xml
uv build

Docs-only gate:

uv run pytest tests/test_cicd_docs_contracts.py tests/test_docs_mermaid_contracts.py -q
python -m pip install -r docs/requirements.txt
mkdocs build --strict

Credential-free source/sink matrix:

DPONE_RUN_INTEGRATION=1 \
DPONE_RUN_INTEGRATION_MATRIX=1 \
DPONE_MATRIX_RUN_MODE=mock_contract \
DPONE_MATRIX_ARTIFACT_DIR=test_artifacts/integration_matrix/mock_contract_latest \
uv run pytest -m integration_matrix tests/integration/matrix -q

Local/mock source/sink matrix:

DPONE_RUN_INTEGRATION=1 \
DPONE_RUN_INTEGRATION_MATRIX=1 \
DPONE_MATRIX_RUN_MODE=mock_local \
DPONE_MATRIX_ARTIFACT_DIR=test_artifacts/integration_matrix/mock_local_latest \
uv run pytest -m integration_matrix_mock tests/integration/matrix -q

Replay recovery/control-plane gate:

DPONE_RUN_INTEGRATION_REPLAY=1 uv run pytest -m integration_replay tests/integration/replay -q

Focused Postgres -> MSSQL native path:

docker compose -f docker/docker-compose.integration.yml up -d postgres mssql
DPONE_RUN_INTEGRATION=1 \
uv run pytest tests/integration/mssql/test_postgres_to_mssql_native_transfer_integration.py -q

Critical-route type matrix certification:

uv run pytest -m type_matrix_certification tests/test_type_matrix_certification.py -q
gh workflow run "Live certification" -f profile=type_matrix_certification -f row_count=10000

Matrix behavior artifacts¶

When DPONE_MATRIX_ARTIFACT_DIR is set, the matrix writes one metadata artifact and one strategy behavior artifact per selected case. Each behavior artifact uses a configurable deterministic volume profile: 10,000 base source rows by default, 20% changed rows, 5% physical deletes, and 120 wide_* source columns with mixed null sparsity.
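For the default profile, the percentages translate into concrete row counts. The dataclass below is a sketch with hypothetical field names; only the numbers come from the paragraph above, and it does not model whether changed and deleted rows overlap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VolumeProfile:
    """Hypothetical container for the default volume profile described above."""

    base_rows: int = 10_000
    changed_pct: int = 20
    deleted_pct: int = 5
    wide_columns: int = 120

    @property
    def changed_rows(self) -> int:
        return self.base_rows * self.changed_pct // 100

    @property
    def deleted_rows(self) -> int:
        return self.base_rows * self.deleted_pct // 100


profile = VolumeProfile()
print(profile.changed_rows, profile.deleted_rows)  # 2000 500
```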

Use the __behavior.json artifact to compare full-volume counts/checksums first, then inspect target_before, source_rows, expected_rows, and actual_rows samples when a strategy case turns red.
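When the counts disagree, a quick diff of the row samples narrows the problem. The sketch below assumes that `expected_rows` and `actual_rows` are top-level lists of row objects sharing a key column named `id`; the real artifact layout may differ, so adjust the access paths to match the file.

```python
import json


def diff_samples(behavior: dict, key: str = "id") -> dict:
    """Compare expected and actual row samples by key (layout is an assumption)."""
    expected = {row[key]: row for row in behavior["expected_rows"]}
    actual = {row[key]: row for row in behavior["actual_rows"]}
    return {
        "missing": sorted(expected.keys() - actual.keys()),
        "unexpected": sorted(actual.keys() - expected.keys()),
        "mismatched": sorted(
            k for k in expected.keys() & actual.keys() if expected[k] != actual[k]
        ),
    }


# Inline sample standing in for a loaded __behavior.json file.
sample = json.loads(
    '{"expected_rows": [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}],'
    ' "actual_rows": [{"id": 2, "v": "x"}, {"id": 3, "v": "c"}]}'
)
print(diff_samples(sample))
# {'missing': [1], 'unexpected': [3], 'mismatched': [2]}
```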

Benchmark artifacts¶

Use benchmark artifacts when a change touches native bulk transfer, partitioning, or target ingest performance:

Native transfer benchmark artifact: 2026-06-11: latest local-live evidence for 10k, 1M, and 10M rows across Postgres -> MSSQL and MSSQL -> ClickHouse.
Native transfer benchmark artifact: 2026-06-09: previous local-live release suite.
Native transfer benchmark artifact: 2026-06-08: historical local-live evidence and extended tuning context.

Runbook: choose the right layer¶

Use default pytest for every PR.
Use mock_contract when docs, manifests, strategy support, or source -> sink coverage changes.
Use mock_local when connector runtime, staging, type mapping, or load strategy behavior changes.
Use service-specific markers when native client or local DB/Kafka behavior changes.
Use vendor_live before release or after changes that affect managed external systems.

Runbook: common failures¶

Failure: mock_contract fails.

Check dpone.integration_matrix first; it is the canonical registry.
Check that every pair has a guide in docs/source-sink.
Check that the links to the Source -> sink matrix and Load strategies pages still resolve.

Failure: mock_local service does not start.

Re-run docker compose -f docker/docker-compose.integration.yml ps.
Inspect logs for the failing service with docker logs <container>.
For MSSQL, verify the password satisfies SQL Server complexity rules and that the runner has enough memory.
For Kafka, verify kafka becomes healthy before schema-registry starts.

Failure: vendor_live fails.

Verify that credentials are present in GitHub Actions secrets or in the local environment.
Re-run a focused case with DPONE_MATRIX_CASE_ID=<case> when the matrix is involved.
Never copy credentials into artifacts or logs.

Failure: docs links break after moving testing pages.

Keep testing pages under docs/testing/.
Update mkdocs.yml, the Documentation index, and the README links together.
Run mkdocs build --strict before pushing.

Nested normalization evidence¶

Nested normalization testing covers spill-to-disk, lint, certification, benchmark evidence, reverse readback and guardrail runbooks.

Native transfer certification route evidence¶

live-certification.yml profile=native_transfer creates two route-specific native evidence indexes before building the strategy certification bundle:

postgres-mssql/evidence/.../evidence_index.json
mssql-clickhouse/evidence/.../evidence_index.json

This keeps the release profile symmetric for the two critical routes and avoids shipping a release that only certifies one side of the native transfer work.
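The symmetry requirement can be checked mechanically before the bundle is built. This is a sketch under stated assumptions: `missing_route_evidence` is a hypothetical helper, and the elided path segment is treated as "any depth below `<route>/evidence/`" rather than a known layout.

```python
from pathlib import Path

ROUTES = ("postgres-mssql", "mssql-clickhouse")


def missing_route_evidence(root: Path) -> list:
    """Return routes with no evidence_index.json anywhere under <route>/evidence/."""
    missing = []
    for route in ROUTES:
        evidence_dir = root / route / "evidence"
        # The intermediate directories are not specified, so search recursively.
        found = evidence_dir.is_dir() and any(evidence_dir.rglob("evidence_index.json"))
        if not found:
            missing.append(route)
    return missing
```

A release step could fail fast when the returned list is non-empty, which is the "only one side certified" situation the profile is meant to prevent.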