Runtime State Backends
dpone can store runtime state outside the default BigQuery backend.
Supported OSS state backends:
mssql: XMin checkpoints and run-state rows in SQL Server.
postgres: XMin checkpoints and run-state rows in PostgreSQL.
MSSQL state
state:
  type: mssql
  connection_id: mssql_dwh
  connection_type: vault
  table: {schema: etl_state, name: etl_xmin_state}
If state.connection_id is omitted and the sink is MSSQL, the runtime can reuse the sink connection.
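The fallback rule above can be sketched as a small lookup. This is an illustration only, not dpone's actual resolution code: the function name resolve_state_connection_id and the sink keys type and connection_id are assumptions made for the example.

```python
from typing import Any, Mapping, Optional


def resolve_state_connection_id(
    state_cfg: Mapping[str, Any], sink_cfg: Mapping[str, Any]
) -> Optional[str]:
    """Pick the connection id for MSSQL state storage (hypothetical helper)."""
    explicit = state_cfg.get("connection_id")
    if explicit:
        # An explicit state.connection_id always wins.
        return explicit
    # Fallback described in the text: reuse the sink connection when the sink is MSSQL.
    # The sink key names used here are assumed, not taken from dpone.
    if state_cfg.get("type") == "mssql" and sink_cfg.get("type") == "mssql":
        return sink_cfg.get("connection_id")
    return None


state = {"type": "mssql", "table": {"schema": "etl_state", "name": "etl_xmin_state"}}
sink = {"type": "mssql", "connection_id": "mssql_dwh"}
print(resolve_state_connection_id(state, sink))
```

With the sample dictionaries, the state block has no connection_id, so the helper returns the sink's mssql_dwh.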
PostgreSQL state
PostgreSQL-backed state is useful for local OSS deployments or for teams that want metadata close to a Postgres source.
state:
  type: postgres
  connection_id: postgres_meta
  connection_type: vault
  table: {schema: etl_state, name: etl_xmin_state}
Developer API:
from dpone.runtime.state.postgres import PostgresRunStateStorage, PostgresXMinStateStorage
xmin_state = PostgresXMinStateStorage(connector, schema="etl_state", table="etl_xmin_state")
run_state = PostgresRunStateStorage(connector, schema="etl_state", table="etl_run_state")
Both backends expose the same methods used by the runtime:
save_state(source_schema, source_table, xmin_state)
load_state(source_schema, source_table)
delete_state(source_schema, source_table)
save_run_state(run_state)
update_run_state(run_state)
get_run_state(dag_id, execution_date)
get_run_states_by_dag(dag_id, execution_date=None, limit=100)
delete_old_states(days_to_keep=30)
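Because the method names are shared, code that depends on state storage can be written against an interface and tested without a database. The sketch below covers only the three checkpoint methods. The XMinStateStore protocol, the InMemoryXMinStateStore test double, and the {"last_xmin": ...} payload shape are all hypothetical and not part of dpone; the real argument and return types may differ.

```python
from typing import Any, Dict, Optional, Protocol, Tuple


class XMinStateStore(Protocol):
    """Checkpoint methods from the list above; the type hints are assumptions."""

    def save_state(self, source_schema: str, source_table: str, xmin_state: Any) -> None: ...

    def load_state(self, source_schema: str, source_table: str) -> Optional[Any]: ...

    def delete_state(self, source_schema: str, source_table: str) -> None: ...


class InMemoryXMinStateStore:
    """Hypothetical in-memory stand-in, keyed by (schema, table)."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[str, str], Any] = {}

    def save_state(self, source_schema: str, source_table: str, xmin_state: Any) -> None:
        # Overwrites any previous checkpoint for the same source table.
        self._rows[(source_schema, source_table)] = xmin_state

    def load_state(self, source_schema: str, source_table: str) -> Optional[Any]:
        # Returns None when no checkpoint has been saved yet.
        return self._rows.get((source_schema, source_table))

    def delete_state(self, source_schema: str, source_table: str) -> None:
        # Deleting a missing checkpoint is a no-op in this sketch.
        self._rows.pop((source_schema, source_table), None)


store: XMinStateStore = InMemoryXMinStateStore()
store.save_state("public", "orders", {"last_xmin": 123456})  # payload shape is assumed
print(store.load_state("public", "orders"))
store.delete_state("public", "orders")
print(store.load_state("public", "orders"))
```

The run-state methods could be stubbed the same way, for example keyed by dag_id and execution_date.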
Postgres XMin state runbook
The detailed XMin guide lives in Postgres XMin. Use it when a Postgres source does not have an application-level updated_at or numeric cursor and you want dpone to persist transaction-ID checkpoints in BigQuery, Postgres, or MSSQL state storage.
In short: to use XMin for Postgres incremental extraction, either set source.options.incremental_strategy: xmin explicitly or, in legacy manifests, omit source.options.incremental_column. Set state.type explicitly when the checkpoint must live outside the default BigQuery state backend.
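As a rough illustration of the selection rule, the helper below decides whether a source's options imply XMin. It is a sketch under two assumptions that the text does not confirm: a "legacy manifest" is taken to mean one with no incremental_strategy key, and any explicit strategy other than xmin is taken to rule XMin out. The function name uses_xmin is hypothetical.

```python
from typing import Any, Mapping


def uses_xmin(source_options: Mapping[str, Any]) -> bool:
    """Return True when the options select XMin extraction (hypothetical helper)."""
    strategy = source_options.get("incremental_strategy")
    if strategy is not None:
        # Explicit setting: only the value "xmin" selects XMin (assumed).
        return strategy == "xmin"
    # Assumed legacy case: no strategy key, so XMin applies only when
    # no incremental_column is configured.
    return not source_options.get("incremental_column")


print(uses_xmin({"incremental_strategy": "xmin"}))
print(uses_xmin({}))
print(uses_xmin({"incremental_column": "updated_at"}))
```

Under these assumptions the first two calls select XMin and the third does not, because a cursor column is configured.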