Skip to content

Schema contracts

Schema contracts let users define portable logical column types before dpone renders target-specific DDL. They are the right place for business-critical columns, sparse columns, financial decimals, timestamps, and fields where sampled inference is not acceptable.

Basic contract

schema_contract:
  enforcement: strict
  columns:
    amount:
      type: decimal
      precision: 18
      scale: 4
      nullable: false
    updated_at:
      type: timestamp
      timezone: true
      nullable: false
    payload:
      type: json
      nullable: true

Enforcement modes

Mode Behavior
strict Fail before or during load when values do not match the contract.
coerce Attempt explicit value conversion before staging.
quarantine Route bad rows to __dpone__quarantine and keep the target clean.
warn Continue best-effort and emit run artifact warnings.

Production default is strict. For dirty APIs and files, use quarantine. Runtime behavior is implemented by Runtime data contracts. That layer decides which rows are safe to stage, which rows go to quarantine, and whether state can advance after the load.

Type conflict policies

type_inference.conflict_policy handles conflicts between source values, contracts, and target shape:

Policy Behavior
fail Stop before target writes.
variant_column Route incompatible values into __dpone__nc__<column>.
quarantine Keep target columns clean and store rejected rows with diagnostics.

Example:

sink:
  options:
    type_inference:
      conflict_policy: variant_column
    schema_evolution:
      data_type: variant_column

If amount was decimal(18,2) and source starts sending JSON/text payloads, dpone creates or reuses __dpone__nc__amount. The old amount column remains untouched.

Target-specific type override

Use logical contracts for portable semantics and physical overrides for one specific sink:

sink:
  options:
    physical_design:
      columns:
        amount:
          target_type:
            mssql: decimal(18,4)
            postgres: numeric(18,4)
            clickhouse: Decimal(18,4)
            bigquery: NUMERIC

The override wins over inference. If it would narrow an existing target column, the online DDL governance layer blocks it unless a safe-window or manual approval flow is configured.

CLI

dpone schema infer --manifest manifests/orders.batch.yaml --format md
dpone schema physical-plan --manifest manifests/orders.batch.yaml --format json
dpone plan manifests/orders.batch.yaml --selector public.orders --format json

Runbook

Change Recommended action
Add a sparse field Add it to schema_contract.columns so it appears before first non-null value.
Currency amount Use decimal with explicit precision and scale.
Timestamp from API Set type: timestamp and timezone intentionally.
Source type changed incompatibly Prefer fail; use variant_column only for planned downstream migration.
Bad rows should not block ingestion Use enforcement: quarantine and monitor quarantine artifacts.