Strategy Intelligence¶
Strategy Intelligence is the dpone control-plane layer for choosing, explaining, repairing, certifying, and optimizing load strategies before a pipeline writes data.
It does not replace explicit manifests. It helps users move from "I picked a strategy by hand" to "dpone explained the safest and fastest target-native plan for this source -> sink pair".
What it covers¶
| Capability | User value | CLI/API |
|---|---|---|
| Strategy Advisor | Selects full_refresh, append, merge, partition_replace, cdc_apply, or other strategies from manifest hints. |
dpone strategy advise |
| Explainable plan | Shows why a strategy was selected and which safety gates apply. | dpone strategy advise --format md |
| Adaptive batching | Suggests initial batch size and parallel workers from row estimates and partition hints. | included in decision payload |
| Repair / resume UX | Produces deterministic repair commands for failed runs. | included in repair_commands |
| Certification matrix | Classifies source -> sink -> strategy support as contract, manual live gate, or not supported. | Python API v1 |
| Native fast paths | Documents target-native bulk commands and fallbacks. | included in decision payload |
| Runtime-native auto compile | Compiles strategy.mode: auto into a concrete LoadStrategy before hydration. |
dpone run, dpone plan |
| Native preflight | Checks whether local fast-path tools are available. | dpone strategy preflight |
| Certification artifacts | Writes source -> sink -> strategy evidence matrix. | dpone strategy certification-artifact |
Strategy Advisor¶
Run the advisor against a manifest:
dpone strategy advise orders.yaml \
--estimated-rows 50000000 \
--changed-percent 0.20 \
--delete-percent 0.05 \
--format md
Example output highlights:
strategy: partition_replace
merge_policy: delete_insert
native_fast_path: postgres_copy_to_mssql_bcp
adaptive_batching: true
Runtime-native mode: auto¶
strategy.mode: auto is compiled before runtime hydration, so sinks receive a concrete LoadStrategy.
For a large partitioned Postgres -> MSSQL load, this can compile to:
The decision is saved in LoadConfig.options.strategy_intelligence, so run artifacts and plan output can explain why the strategy was selected.
Use dpone plan --explain-strategy to see the decision payload:
Native preflight¶
Check whether the local machine has the tools needed for the recommended fast path:
With --fail-on-missing, the command exits non-zero when required tools are missing:
Auto selection rules v1¶
The v1 advisor is deliberately conservative:
| Manifest/runtime signal | Recommended strategy |
|---|---|
CDC is available and unique_key is configured |
cdc_apply |
| Large partitioned delta on DB sink | partition_replace |
unique_key is configured on mutable DB/Kafka target |
incremental_merge |
Cursor exists but no unique_key |
incremental_append |
| No safe incremental hints | full_refresh |
Kafka intentionally rejects partition_replace because it is an event log, not a partitioned mutable table target.
Repair / resume UX¶
When a finalization step fails, Strategy Intelligence returns safe repair steps and actionable commands.
For partition replacement:
Safe runbook:
- Inspect the run report and load IDs.
- Validate staging files/tables and row counts.
- Replay only affected partitions or resume from an idempotent stage.
- Run source-target reconciliation.
- Commit state only after sink success.
CLI:
dpone strategy repair-plan RUN_ID \
--source-type postgres \
--sink-type mssql \
--strategy partition_replace \
--failed-stage finalize \
--partition 2026-01-01 \
--format md
Certification artifacts¶
Write the strategy support matrix as JSON and Markdown:
The artifact records support status for source -> sink -> strategy combinations:
contract_gate: unit/contract evidence is sufficient for normal CI.manual_live_gate: run the local/live integration matrix before claiming certification.not_supported: fail early with a clear diagnostic.
Native fast paths¶
| Source -> sink | Fast path | Required tools | Fallback |
|---|---|---|---|
| Postgres -> MSSQL | postgres_copy_to_mssql_bcp |
psycopg, bcp |
pyodbc_fast_executemany |
| MSSQL -> ClickHouse | mssql_bcp_queryout_to_clickhouse_direct_tsv |
bcp, clickhouse-client |
python_streaming_batches |
| Postgres -> ClickHouse | postgres_copy_to_clickhouse_http_tsv |
psycopg, clickhouse-client or curl |
streaming_rows |
| MSSQL -> MSSQL | mssql_bcp_queryout_to_bcp_import |
bcp |
pyodbc_fast_executemany |
Mermaid flow¶
flowchart TD
A["Manifest hints"] --> B["StrategyContext"]
B --> C["StrategyAdvisor"]
C --> D["StrategyDecision"]
C --> E["NativeFastPathCatalog"]
D --> F["Adaptive batching"]
D --> G["Safety gates"]
D --> H["Repair commands"]
D --> I["Certification matrix evidence"]
Python API¶
from dpone.strategy_intelligence.advisor import StrategyAdvisor, StrategyContext
decision = StrategyAdvisor().advise(
StrategyContext(
source_type="postgres",
sink_type="mssql",
requested_mode="auto",
unique_key=("order_id",),
estimated_rows=50_000_000,
changed_percent=0.20,
delete_percent=0.05,
partition_column="business_date",
)
)
print(decision.strategy_mode)
Troubleshooting¶
| Symptom | Cause | Fix |
|---|---|---|
Advisor returns full_refresh |
Manifest has no cursor, key, CDC, or partition hints. | Add unique_key, incremental_column, CDC config, or partition config. |
Advisor rejects partition_replace for Kafka |
Kafka is not a mutable partitioned table sink. | Use keyed incremental_merge event upserts. |
| Native fast path says fallback | Required local tool is missing. | Run dpone doctor and install the target-specific client. |
| Repair plan is not auto-resumable | Failure happened during finalization. | Replay staged partitions or run resync --plan before state commit. |
Live preflight¶
dpone strategy preflight checks local tools such as bcp and clickhouse-client. The live execution layer also exposes a probe contract for target-specific readiness checks:
- ODBC/driver availability;
- staging schema accessibility;
- target write permissions;
- target lock risk before finalization.
The default OSS probe is plan-only and safe. Live adapters can inject concrete probes for MSSQL, Postgres, ClickHouse, or BigQuery without changing CLI code.
Adaptive runtime batching¶
mode: auto now stores an adaptive batch plan in LoadConfig.options.strategy_intelligence. When a user has not explicitly set batch_size, the initial adaptive batch size becomes LoadConfig.batch_size.
Runtime observations can be fed into AdaptiveBatchController:
from dpone.strategy_intelligence import AdaptiveBatchController, BatchObservation
controller = AdaptiveBatchController(initial_batch_size=50000, min_batch_size=10000, max_batch_size=200000)
adjustment = controller.observe(BatchObservation(rows=50000, duration_seconds=1.0, target_backpressure=0.05))
print(adjustment.next_batch_size)
Rules are intentionally simple in v1:
- increase batch size when throughput is healthy and backpressure is low;
- decrease batch size when target backpressure is high;
- always stay within configured min/max bounds.
Resync and resume commands¶
Repair commands are now first-class top-level CLI commands. They are safe by default: without --yes, they write a replay artifact and do not mark replay as executed.
Plan a partition resync:
dpone resync \
--run-id RUN_ID \
--source-type postgres \
--sink-type mssql \
--strategy partition_replace \
--partition 2026-01-01 \
--format md
Approve execution contract:
dpone resync \
--run-id RUN_ID \
--source-type postgres \
--sink-type mssql \
--strategy partition_replace \
--partition 2026-01-01 \
--yes
Plan a resume from an idempotent failed stage:
dpone resume RUN_ID \
--from-stage stage \
--source-type postgres \
--sink-type mssql \
--strategy incremental_merge
Execution semantics in v1:
mode: plan: write replay artifact only;mode: execute: mark replay artifact as approved/executed after--yes;- live database mutation remains an adapter-level next step, not implicit CLI behavior.
Replay adapters¶
Replay adapters turn an approved replay artifact into a sink-specific execution sequence. They are still dependency-injected: the default OSS backend records an artifact-only execution contract, while live deployments can provide database or Kafka backends.
Database replay sequence:
sequenceDiagram
participant CLI as dpone resync/resume --yes
participant Service as ReplayExecutionService
participant Adapter as DbReplayAdapter
participant Backend as ReplayBackend
participant State as State backend
CLI->>Service: ReplayExecutionRequest
Service->>Adapter: execute(request)
Adapter->>Backend: validate staging
Adapter->>Backend: execute finalizer
Adapter->>Backend: reconcile source and target
Adapter->>State: commit state
Service-->>CLI: artifact with operations
Kafka replay sequence:
sequenceDiagram
participant CLI as dpone resume --yes
participant Adapter as KafkaReplayAdapter
participant Backend as ReplayBackend
participant Kafka as Kafka producer
participant State as State backend
CLI->>Adapter: execute(request)
Adapter->>Backend: validate staging
Adapter->>Kafka: produce replay events
Adapter->>Backend: reconcile produced events
Adapter->>State: commit state
Safety rules:
- Always validate staging before finalization or event production.
- Always reconcile before state commit.
- Never commit state when reconciliation fails.
- Keep Kafka replay append/event-log only; do not emulate partition replacement.
- Record operations in the replay artifact for auditability.
In artifact-only mode, --yes marks the replay contract as executed but does not mutate external systems. Live mutation requires an injected ReplayBackend from a deployment/runtime adapter.
Live replay backends¶
Live replay backends are the sink-specific implementation layer behind approved dpone resync --yes and dpone resume --yes requests. They use injected clients, so dpone core does not create DB/Kafka connections inside CLI commands.
Supported backend contracts:
| Sink | Backend | Finalization path |
|---|---|---|
| MSSQL | MssqlReplayBackend |
partition switch for partition_replace, delete+insert fallback for merge-like replay. |
| Postgres | PostgresReplayBackend |
partition attach/detach contract for partition_replace, delete+insert fallback. |
| ClickHouse | ClickHouseReplayBackend |
native ALTER TABLE ... REPLACE PARTITION ... FROM staging. |
| BigQuery | BigQueryReplayBackend |
partition-scoped delete+insert contract. |
| Kafka | KafkaLiveReplayBackend |
replay event production, no mutable target operations. |
All live backends follow the same safety invariant:
flowchart TD
A["validate staging"] --> B["execute finalizer or produce replay events"]
B --> C["reconcile"]
C --> D{"reconciliation passed?"}
D -- yes --> E["commit state"]
D -- no --> F["fail without state commit"]
The default package still ships an artifact-only backend for safe OSS use. Deployments can inject SQL/Kafka clients that satisfy the backend protocols.
Runtime replay clients¶
Runtime replay clients bridge the control-plane replay contracts to the existing runtime connector and credential layer.
| Class | Role |
|---|---|
ReplayBackendConnection |
Declarative reference to sink type, connection id, target table, staging schema, and credentials source |
RuntimeSqlReplayClient |
Wraps runtime SQL connectors and exposes the narrow exists, scalar, execute replay port |
RuntimeKafkaReplayClient |
Wraps KafkaConnector.create_producer() and exposes produce, flush, scalar |
RuntimeReplayBackendFactory |
Builds MSSQL/Postgres/ClickHouse/BigQuery/Kafka replay backends through existing BaseFactory credential resolution |
Example:
from dpone.strategy_intelligence import ReplayBackendConnection, RuntimeReplayBackendFactory
backend = RuntimeReplayBackendFactory().build(
ReplayBackendConnection(
sink_type="mssql",
connection_id="mssql_dwh",
target_schema="dbo",
target_table="orders",
staging_schema="staging",
credentials_source="env",
)
)
The factory does not import pyodbc, psycopg, clickhouse-driver, or confluent-kafka directly. Those SDKs stay behind runtime connectors and are loaded only when a live connector is actually built.
Native transfer plan details¶
For MSSQL -> ClickHouse, strategy intelligence includes an optional native_transfer_plan inside the decision payload. dpone plan --explain-strategy renders the selected export method, ingest method, finalizer and partitioning strategy.
Example decision fields: