Runtime State Backends

dpone can store runtime state outside the default BigQuery backend.

Supported OSS state backends:

  • mssql: XMin checkpoints and run-state rows in SQL Server.
  • postgres: XMin checkpoints and run-state rows in PostgreSQL.

MSSQL state

state:
  type: mssql
  connection_id: mssql_dwh
  connection_type: vault
  table: {schema: etl_state, name: etl_xmin_state}

If state.connection_id is omitted and the sink is MSSQL, the runtime can reuse the sink connection.
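The fallback described above can be sketched as a small lookup. This is an illustration only, not dpone's implementation: the helper name `resolve_state_connection_id` and the `sink["type"]` and `sink["connection_id"]` keys are assumptions made for the example.

```python
from typing import Optional


def resolve_state_connection_id(state: dict, sink: dict) -> Optional[str]:
    """Hypothetical helper: pick the connection used for MSSQL state storage."""
    # An explicit state.connection_id always takes precedence.
    explicit = state.get("connection_id")
    if explicit:
        return explicit
    # Fallback: reuse the sink connection when both state and sink are MSSQL.
    # The sink key names used here are assumed, not taken from dpone.
    if state.get("type") == "mssql" and sink.get("type") == "mssql":
        return sink.get("connection_id")
    # No explicit connection and no compatible sink to reuse.
    return None
```

With an MSSQL sink and no `state.connection_id`, the sketch returns the sink's connection ID; with any other sink type it returns `None`, which a caller would presumably treat as a configuration error.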

PostgreSQL state

PostgreSQL-backed state is useful for local OSS deployments or for teams that want metadata close to a Postgres source.

state:
  type: postgres
  connection_id: postgres_meta
  connection_type: vault
  table: {schema: etl_state, name: etl_xmin_state}

Developer API:

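from dpone.runtime.state.postgres import PostgresRunStateStorage, PostgresXMinStateStorage

# `connector` is assumed to be an already-configured Postgres connector.
# XMin checkpoints and run-state rows are kept in separate tables.
xmin_state = PostgresXMinStateStorage(connector, schema="etl_state", table="etl_xmin_state")
run_state = PostgresRunStateStorage(connector, schema="etl_state", table="etl_run_state")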

The MSSQL and PostgreSQL backends expose the same methods, which the runtime calls:

  • save_state(source_schema, source_table, xmin_state)
  • load_state(source_schema, source_table)
  • delete_state(source_schema, source_table)
  • save_run_state(run_state)
  • update_run_state(run_state)
  • get_run_state(dag_id, execution_date)
  • get_run_states_by_dag(dag_id, execution_date=None, limit=100)
  • delete_old_states(days_to_keep=30)
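To make the checkpoint part of this contract concrete, here is a minimal in-memory stand-in for the three XMin methods, of the kind one might use in unit tests. It is a sketch, not dpone code: the class name `InMemoryXMinStateStorage`, the dictionary used as `xmin_state`, and the choice to return `None` for a missing checkpoint are all assumptions, and the real backends may behave differently.

```python
from typing import Any, Dict, Optional, Tuple


class InMemoryXMinStateStorage:
    """Hypothetical test double mirroring the XMin checkpoint methods."""

    def __init__(self) -> None:
        # Checkpoints are keyed by (source_schema, source_table).
        self._states: Dict[Tuple[str, str], Any] = {}

    def save_state(self, source_schema: str, source_table: str, xmin_state: Any) -> None:
        # Overwrites any previous checkpoint for the same table.
        self._states[(source_schema, source_table)] = xmin_state

    def load_state(self, source_schema: str, source_table: str) -> Optional[Any]:
        # Returning None for a missing checkpoint is an assumption of this sketch.
        return self._states.get((source_schema, source_table))

    def delete_state(self, source_schema: str, source_table: str) -> None:
        # Deleting a checkpoint that does not exist is a no-op here.
        self._states.pop((source_schema, source_table), None)


store = InMemoryXMinStateStorage()
# The payload shape is a placeholder; dpone's real xmin_state object is not shown here.
store.save_state("public", "orders", {"last_xmin": 123456})
loaded = store.load_state("public", "orders")
store.delete_state("public", "orders")
after_delete = store.load_state("public", "orders")
```

Keying by schema and table follows the method signatures listed above, where every checkpoint call takes `source_schema` and `source_table`.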

Postgres XMin state runbook

The detailed XMin guide lives in Postgres XMin. Use it when a Postgres source does not have an application-level updated_at or numeric cursor and you want dpone to persist transaction-ID checkpoints in BigQuery, Postgres, or MSSQL state storage.

In short: to use XMin for Postgres incremental extraction, set source.options.incremental_strategy: xmin explicitly or, in legacy manifests, omit source.options.incremental_column. Set state.type explicitly when the checkpoint must live outside the default BigQuery state backend.
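The selection rule can be written as a small predicate. This is a sketch under stated assumptions rather than dpone's actual logic: the function name `uses_xmin` and the `legacy_manifest` flag are hypothetical, since how the runtime recognizes a legacy manifest is not described here.

```python
def uses_xmin(source_options: dict, legacy_manifest: bool = False) -> bool:
    """Hypothetical predicate: would a Postgres source use XMin extraction?"""
    strategy = source_options.get("incremental_strategy")
    # Explicit opt-in: incremental_strategy is set to xmin.
    if strategy == "xmin":
        return True
    # Legacy behaviour: no strategy and no incremental_column implies XMin.
    # Treating "legacy" as a caller-supplied flag is an assumption of this sketch.
    if strategy is None and legacy_manifest and "incremental_column" not in source_options:
        return True
    return False
```

Under these assumptions, a manifest that sets `incremental_column` without an explicit strategy does not select XMin, which matches the rule that the column must be omitted in legacy manifests.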