Physical design¶
Physical design controls target-specific DDL: concrete column types, partitioning, indexes, clustering, table engines, compression, and storage settings. It complements schema contracts: contracts are portable, physical design is sink-specific.
Configuration¶
sink:
options:
physical_design:
enabled: true
mode: auto
apply: online
columns:
status:
target_type:
clickhouse: LowCardinality(String)
partitioning:
strategy: auto
column: business_date
granularity: month
indexes:
strategy: auto
primary_key: [order_id]
storage:
mssql:
compression: page
clustered_columnstore: auto
postgres:
fillfactor: 90
clickhouse:
engine: MergeTree
partition_by: toYYYYMM(business_date)
order_by: [business_date, order_id]
low_cardinality:
mode: auto
bigquery:
partition_by: business_date
clustering: [customer_id, status]
Apply modes¶
| Mode | Behavior |
|---|---|
online |
Apply only low-risk online-safe changes automatically. |
safe_window |
Allow planned blocking DDL inside a maintenance window. |
plan_only |
Render DDL and risks, do not apply. |
manual_approval |
Require approval artifact before execution. |
New target tables can receive the full physical design. Existing targets only receive online-safe changes automatically. The execution decision is handled by Physical DDL apply runtime, keeping planning, approval, and connector-specific execution testable.
Target examples¶
MSSQL¶
sink:
options:
physical_design:
indexes:
primary_key: [order_id]
storage:
mssql:
compression: page
clustered_columnstore: false
Planned SQL shape:
CREATE TABLE [landing].[orders] (...);
ALTER TABLE [landing].[orders] REBUILD WITH (DATA_COMPRESSION = PAGE);
CREATE INDEX [ix_landing_orders_order_id] ON [landing].[orders] ([order_id]);
Compression rebuilds are blocking and require safe_window or
manual_approval on existing tables.
Postgres¶
Planned SQL uses CREATE INDEX CONCURRENTLY for existing-table index paths
where possible.
ClickHouse¶
sink:
options:
physical_design:
storage:
clickhouse:
engine: MergeTree
partition_by: toYYYYMM(business_date)
order_by: [business_date, order_id]
low_cardinality:
mode: auto
max_distinct_values: 10000
max_distinct_ratio: 0.05
BigQuery¶
sink:
options:
physical_design:
storage:
bigquery:
partition_by: business_date
clustering: [customer_id, status]
BigQuery physical design is planned as table creation or controlled recreation. Unsafe changes to existing partitioning/clustering are not applied silently.
ClickHouse LowCardinality modes¶
| Mode | Behavior |
|---|---|
off |
Never generate LowCardinality. |
auto |
Use profiler cardinality thresholds for string-like columns. |
explicit |
Apply only to configured columns. |
force |
Apply to configured columns even when profiler disagrees, with warning. |
preserve |
Keep existing/source LowCardinality, but do not generate new ones. |
Auto mode requires a string-like column and a stable low-cardinality profile:
distinct_count <= 10000distinct_ratio <= 0.05- column name does not look like an identifier, UUID, email, URL, hash, or token
CLI¶
dpone schema physical-plan --manifest manifests/orders.batch.yaml --format md
dpone plan manifests/orders.batch.yaml --selector public.orders --format json
Runbook¶
| Symptom | Action |
|---|---|
| Index/compression DDL is blocked | Switch to safe_window or create an approval artifact. |
| ClickHouse query is slow after load | Review ORDER BY and partition cardinality. |
| LowCardinality hurts performance | Set low_cardinality.mode: off or move the column out of columns. |
| BigQuery partition change is required | Use a shadow/recreate plan, then validate downstream permissions. |
| MSSQL table is write-heavy | Prefer row compression or no compression over page compression. |