Skip to content

Physical design

Physical design controls target-specific DDL: concrete column types, partitioning, indexes, clustering, table engines, compression, and storage settings. It complements schema contracts: contracts are portable, physical design is sink-specific.

Configuration

sink:
  options:
    physical_design:
      enabled: true
      mode: auto
      apply: online
      columns:
        status:
          target_type:
            clickhouse: LowCardinality(String)
      partitioning:
        strategy: auto
        column: business_date
        granularity: month
      indexes:
        strategy: auto
        primary_key: [order_id]
      storage:
        mssql:
          compression: page
          clustered_columnstore: auto
        postgres:
          fillfactor: 90
        clickhouse:
          engine: MergeTree
          partition_by: toYYYYMM(business_date)
          order_by: [business_date, order_id]
          low_cardinality:
            mode: auto
        bigquery:
          partition_by: business_date
          clustering: [customer_id, status]

Apply modes

Mode Behavior
online Apply only low-risk online-safe changes automatically.
safe_window Allow planned blocking DDL inside a maintenance window.
plan_only Render DDL and risks, do not apply.
manual_approval Require approval artifact before execution.

New target tables can receive the full physical design. Existing targets only receive online-safe changes automatically. The execution decision is handled by Physical DDL apply runtime, keeping planning, approval, and connector-specific execution testable.

Target examples

MSSQL

sink:
  options:
    physical_design:
      indexes:
        primary_key: [order_id]
      storage:
        mssql:
          compression: page
          clustered_columnstore: false

Planned SQL shape:

CREATE TABLE [landing].[orders] (...);
ALTER TABLE [landing].[orders] REBUILD WITH (DATA_COMPRESSION = PAGE);
CREATE INDEX [ix_landing_orders_order_id] ON [landing].[orders] ([order_id]);

Compression rebuilds are blocking and require safe_window or manual_approval on existing tables.

Postgres

sink:
  options:
    physical_design:
      indexes:
        primary_key: [order_id]
      storage:
        postgres:
          fillfactor: 90

Planned SQL uses CREATE INDEX CONCURRENTLY for existing-table index paths where possible.

ClickHouse

sink:
  options:
    physical_design:
      storage:
        clickhouse:
          engine: MergeTree
          partition_by: toYYYYMM(business_date)
          order_by: [business_date, order_id]
          low_cardinality:
            mode: auto
            max_distinct_values: 10000
            max_distinct_ratio: 0.05

BigQuery

sink:
  options:
    physical_design:
      storage:
        bigquery:
          partition_by: business_date
          clustering: [customer_id, status]

BigQuery physical design is planned as table creation or controlled recreation. Unsafe changes to existing partitioning/clustering are not applied silently.

ClickHouse LowCardinality modes

Mode Behavior
off Never generate LowCardinality.
auto Use profiler cardinality thresholds for string-like columns.
explicit Apply only to configured columns.
force Apply to configured columns even when profiler disagrees, with warning.
preserve Keep existing/source LowCardinality, but do not generate new ones.

Auto mode requires a string-like column and a stable low-cardinality profile:

  • distinct_count <= 10000
  • distinct_ratio <= 0.05
  • column name does not look like an identifier, UUID, email, URL, hash, or token

CLI

dpone schema physical-plan --manifest manifests/orders.batch.yaml --format md
dpone plan manifests/orders.batch.yaml --selector public.orders --format json

Runbook

Symptom Action
Index/compression DDL is blocked Switch to safe_window or create an approval artifact.
ClickHouse query is slow after load Review ORDER BY and partition cardinality.
LowCardinality hurts performance Set low_cardinality.mode: off or move the column out of columns.
BigQuery partition change is required Use a shadow/recreate plan, then validate downstream permissions.
MSSQL table is write-heavy Prefer row compression or no compression over page compression.