Developer strategy intelligence guide¶
dpone.strategy_intelligence is a control-plane package for strategy selection, repair planning, certification metadata, and native fast-path recommendations. It must stay credential-free and side-effect free.
Package map¶
| Module | Responsibility |
|---|---|
dpone.strategy_intelligence.models |
DTOs for decisions, signals, fast paths, repair plans, and certification entries. |
dpone.strategy_intelligence.native_paths |
Static target-native bulk path catalog. |
dpone.strategy_intelligence.advisor |
StrategyAdvisor and StrategyContext selection rules. |
dpone.strategy_intelligence.repair |
RepairPlanService for replay/resume commands. |
dpone.strategy_intelligence.certification |
Source -> sink -> strategy support matrix. |
dpone.strategy_intelligence.compiler |
StrategyAutoCompiler for compiling strategy.mode: auto before LoadConfigBuilder returns. |
dpone.strategy_intelligence.preflight |
Local native-tool availability checks. |
dpone.strategy_intelligence.manifest_reader |
Raw manifest hint reader for advisory CLI. |
dpone.strategy_intelligence.service |
Thin facade for CLI/docs UX. |
Design constraints¶
- No live connector calls.
- No credentials.
- No target writes.
- No imports from runtime source/sink implementations.
- Commands stay thin; business rules belong in
dpone.strategy_intelligenceservices. - Runtime may consume decisions in a later optimizer pass, but v1 is plan/advice only.
StrategyAdvisor¶
StrategyAdvisor converts a StrategyContext into a StrategyDecision:
from dpone.strategy_intelligence.advisor import StrategyAdvisor, StrategyContext
context = StrategyContext(
source_type="postgres",
sink_type="mssql",
requested_mode="auto",
unique_key=("order_id",),
estimated_rows=50_000_000,
changed_percent=0.20,
delete_percent=0.05,
partition_column="business_date",
)
decision = StrategyAdvisor().advise(context)
assert decision.strategy_mode == "partition_replace"
Extension rules¶
When adding a new target-native strategy:
- Add or update the fast path in
NativeFastPathCatalog. - Add an advisor rule only if the signal is deterministic and safe.
- Add a safety gate when state, deletes, locks, or quality semantics can be affected.
- Add certification matrix status and required evidence.
- Add CLI and docs contract tests.
Architecture¶
flowchart LR
CLI["dpone strategy advise"] --> Service["StrategyIntelligenceService"]
Service --> Reader["StrategyManifestReader"]
Service --> Advisor["StrategyAdvisor"]
Advisor --> Paths["NativeFastPathCatalog"]
Service --> Repair["RepairPlanService"]
Matrix["StrategyCertificationMatrixService"] --> Paths
Test requirements¶
Every change must cover:
- direct advisor rules;
- CLI JSON output;
- native path command text;
- repair command ordering;
- certification status for supported and unsupported combinations;
- documentation presence and cross-links.
Runtime optimizer roadmap¶
The current pass compiles strategy.mode: auto before hydration and stores the decision under LoadConfig.options.strategy_intelligence. Remaining runtime optimizer work:
- Runtime validates explicit strategy against advisor warnings before sink load.
- Native path preflight also verifies ODBC, permissions, target staging schema, and table locks.
- Adaptive batching changes batch size during a run based on observed throughput and backpressure.
- Benchmark artifacts feed adaptive batch defaults back into future plans.
- Repair/resync commands execute replay after dry-run approval.
Live execution services¶
| Service | Purpose |
|---|---|
AdaptiveBatchController |
Adjusts batch size from observed rows/sec and target backpressure. |
LivePreflightService |
Aggregates injected target probes for ODBC, permissions, staging schema, and lock risk. |
ReplayExecutionService |
Writes dry-run or approved replay artifacts for dpone resync and dpone resume. |
Probe design¶
LivePreflightProbe is an abstract interface. Concrete live probes must be injected from runtime adapters or tests. This prevents CLI modules from importing MSSQL/Postgres/ClickHouse clients directly.
from dpone.strategy_intelligence.live_preflight import LivePreflightProbe, ProbeCheck
class MssqlProbe(LivePreflightProbe):
def check_odbc(self) -> ProbeCheck:
return ProbeCheck("odbc", True, "ODBC Driver 18 available")
Replay design¶
ReplayExecutionService is artifact-first. The command does not mutate databases by itself. Future live replay adapters should consume the replay artifact, validate staging artifacts again, execute idempotent finalization, run reconciliation, and commit state only after success.
flowchart TD
A["dpone resync/resume"] --> B["ReplayExecutionService"]
B --> C["Replay artifact"]
C --> D["Future live replay adapter"]
D --> E["Validate staging"]
E --> F["Execute finalizer"]
F --> G["Quality reconciliation"]
G --> H["Commit state"]
Replay adapters¶
dpone.strategy_intelligence.replay_adapters contains the execution seam for approved replay.
| Class | Responsibility |
|---|---|
ReplayBackend |
Port for staging validation, finalizer execution, event production, reconciliation, and state commit. |
DbReplayAdapter |
Staging-first sequence for MSSQL, Postgres, ClickHouse, and BigQuery sinks. |
KafkaReplayAdapter |
Event-log replay sequence for Kafka sinks. |
ReplayAdapterRegistry |
Resolves DB vs Kafka adapter by sink type. |
ArtifactOnlyReplayBackend |
Safe default backend that records sequence without external mutation. |
State commit safety is the main invariant: commit_state must run only after staging validation, finalizer/event replay, and reconciliation pass.
flowchart TD
A["ReplayExecutionService"] --> B["ReplayAdapterRegistry"]
B --> C["DbReplayAdapter"]
B --> D["KafkaReplayAdapter"]
C --> E["ReplayBackend.validate_staging"]
E --> F["execute finalizer"]
F --> G["reconcile"]
G --> H["commit state"]
D --> I["produce replay events"]
I --> G
Live backend implementations must be small and sink-specific. Do not add ODBC, Kafka, ClickHouse, or BigQuery client calls to CLI modules or to ReplayExecutionService.
Live replay backend implementations¶
dpone.strategy_intelligence.live_backends contains concrete backend implementations built around injected clients.
| Class/protocol | Purpose |
|---|---|
ReplaySqlClient |
Minimal SQL client protocol: exists, execute, scalar. |
ReplayKafkaClient |
Minimal Kafka client protocol: produce, flush, scalar. |
MssqlReplayBackend |
MSSQL partition switch and delete+insert replay SQL. |
PostgresReplayBackend |
Postgres partition attach/detach contract and delete+insert replay SQL. |
ClickHouseReplayBackend |
ClickHouse REPLACE PARTITION and delete+insert fallback SQL. |
BigQueryReplayBackend |
BigQuery partition-scoped delete+insert replay SQL. |
KafkaLiveReplayBackend |
Kafka replay event production with state commit after reconciliation. |
These classes intentionally depend on injected clients, not on pyodbc, psycopg, clickhouse-driver, BigQuery SDK, or confluent-kafka directly. Runtime/deployment adapters are responsible for wrapping vendor clients into the small protocols.
Example injected clients:
from dpone.strategy_intelligence.live_backends import MssqlReplayBackend
backend = MssqlReplayBackend(
client=my_sql_client,
target_schema="landing",
target_table="orders",
staging_schema="staging",
)
Implementation rules:
- Keep vendor-specific connection creation outside
dpone.strategy_intelligence. - Never commit state before reconciliation passes.
- Keep finalizer SQL generation sink-specific and covered by tests.
- Keep Kafka replay append/event-log only.
- Use artifact-only backend in local dry-run tests unless live credentials are explicitly configured.
Runtime replay client adapters¶
Runtime replay clients adapt existing runtime connectors to replay backend ports:
| Adapter | Input | Output port |
|---|---|---|
RuntimeSqlReplayClient |
MSSQL/Postgres/ClickHouse/BigQuery connector-like object | exists, scalar, execute |
RuntimeKafkaReplayClient |
KafkaConnector connector-like object |
produce, flush, scalar |
RuntimeReplayBackendFactory |
ReplayBackendConnection |
concrete ReplayBackend |
No SDK imports in replay backends¶
Replay backends must not import database or Kafka SDKs. Keep these imports inside runtime connectors:
| SDK | Allowed location |
|---|---|
pyodbc |
dpone.runtime.connectors.mssql |
psycopg |
dpone.runtime.connectors.postgres |
clickhouse_driver |
dpone.runtime.connectors.clickhouse |
confluent_kafka |
dpone.runtime.connectors.kafka |
This keeps dpone.strategy_intelligence import-safe and makes replay execution easy to unit test with injected clients.
Replay evidence writer¶
ReplayEvidenceWriter owns production evidence serialization for replay execution. Keep it separate from backend SQL/Kafka logic:
| Concern | Owner |
|---|---|
| Backend step result | ReplayBackendResult |
| Operation sequencing | ReplayAdapter |
| Raw replay artifact | ReplayExecutionService |
| JSON/Markdown evidence | ReplayEvidenceWriter |
Evidence schema is versioned as dpone.replay.evidence.v1. Add new fields in a backward-compatible way and keep existing field names stable for certification tooling.
Backends should add structured diagnostics under ReplayBackendResult.details:
ReplayBackendResult(
True,
"reconciliation passed",
details={"status_check": {"name": "reconciliation", "status": "passed"}},
)
Use row_count_check only for actual row-count evidence. Do not invent row counts from boolean status probes.