Runtime observability

dpone observability metrics-export converts dpone run evidence into Prometheus text exposition, OpenTelemetry-compatible JSON, a machine-readable summary, and a Markdown report.

The exporter is intentionally local-first and dependency-light. It does not require a running collector, Prometheus server, or vendor SDK to produce artifacts. That keeps CI deterministic while still giving operators files they can ship to their observability stack.

Quickstart

Run a process and save JSON output:

uv run dpone run examples/postgres_to_mssql.yaml \
  --format json > .dpone/runs/orders/run_report.json

Export observability artifacts:

uv run dpone observability metrics-export \
  --run-report .dpone/runs/orders/run_report.json \
  --output-dir .dpone/observability/orders \
  --label env=local \
  --label pipeline=orders \
  --service-name dpone \
  --namespace dpone.local \
  --format json

Generated files:

| File | Purpose |
| --- | --- |
| prometheus_metrics.prom | Prometheus text exposition for scraping or Pushgateway upload. |
| opentelemetry_metrics.json | OTLP-shaped JSON that can be adapted by collectors or CI uploaders. |
| metrics_index.json | SHA-256 checksum manifest for the files produced by this export. |
| runtime_metrics.json | dpone export report with metric list and artifact paths. |
| runtime_metrics.md | Human-readable summary for run artifacts or pull requests. |

How it works

flowchart LR
    Run["dpone run --format json"]
    Extractor["RuntimeMetricsExtractor"]
    Model["MetricPoint[]"]
    Prom["PrometheusTextRenderer"]
    OTel["OpenTelemetryJsonRenderer"]
    Report["RuntimeMetricsExportReport"]
    Artifacts[".dpone/observability/<run>/"]

    Run --> Extractor
    Extractor --> Model
    Model --> Prom
    Model --> OTel
    Prom --> Artifacts
    OTel --> Artifacts
    Model --> Report
    Report --> Artifacts

The exporter reads the standard JSON shape produced by dpone run. It extracts:

| Run field | Metric |
| --- | --- |
| result.extracted_rows | dpone_extracted_rows |
| result.inserted_rows | dpone_inserted_rows |
| result.updated_rows | dpone_updated_rows |
| result.final_rows | dpone_final_rows |
| result.duration_seconds | dpone_duration_seconds |
| result.throughput_rows_per_second | dpone_throughput_rows_per_second |
| attempts | dpone_attempts |
| max_attempts | dpone_max_attempts |
| retry_backoff_seconds | dpone_retry_backoff_seconds |
| result.errors / errors / blockers | dpone_error_count |
| warnings | dpone_warning_count |
| passed | dpone_run_passed |

Labels passed with --label are attached to every metric. When a run report is supplied, the exporter also adds labels such as run_id, process, selector, and status when those fields are available.
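The field-to-metric mapping above can be sketched in a few lines of Python. This is a minimal illustration, not the RuntimeMetricsExtractor implementation: the function names, the dict-based metric point, and the choice to sum the lengths of result.errors, errors, and blockers into one count are all assumptions made for the example.

```python
from typing import Any, Optional

# Dotted run-report path -> metric name, copied from the table above.
NUMERIC_FIELDS = {
    "result.extracted_rows": "dpone_extracted_rows",
    "result.inserted_rows": "dpone_inserted_rows",
    "result.updated_rows": "dpone_updated_rows",
    "result.final_rows": "dpone_final_rows",
    "result.duration_seconds": "dpone_duration_seconds",
    "result.throughput_rows_per_second": "dpone_throughput_rows_per_second",
    "attempts": "dpone_attempts",
    "max_attempts": "dpone_max_attempts",
    "retry_backoff_seconds": "dpone_retry_backoff_seconds",
}
# Assumption: list-valued fields are counted and summed per metric.
COUNT_FIELDS = {
    "dpone_error_count": ("result.errors", "errors", "blockers"),
    "dpone_warning_count": ("warnings",),
}
LABEL_FIELDS = ("run_id", "process", "selector", "status")


def lookup(report: dict, dotted_path: str) -> Any:
    """Walk a dotted path through nested dicts; return None if any key is missing."""
    node: Any = report
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def extract_metrics(report: dict, extra_labels: Optional[dict] = None) -> list:
    """Turn a run report dict into a list of {name, value, labels} points."""
    labels = dict(extra_labels or {})
    for field in LABEL_FIELDS:
        if report.get(field) is not None:
            labels[field] = str(report[field])

    points = []
    for path, name in NUMERIC_FIELDS.items():
        value = lookup(report, path)
        # Skip missing or non-numeric fields (bool is excluded on purpose).
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            continue
        points.append({"name": name, "value": float(value), "labels": dict(labels)})

    for name, paths in COUNT_FIELDS.items():
        total = 0
        for path in paths:
            value = lookup(report, path)
            if isinstance(value, list):
                total += len(value)
        points.append({"name": name, "value": float(total), "labels": dict(labels)})

    if isinstance(report.get("passed"), bool):
        passed = 1.0 if report["passed"] else 0.0
        points.append({"name": "dpone_run_passed", "value": passed, "labels": dict(labels)})
    return points
```

Missing fields are skipped rather than reported as zero in this sketch, so an absent result.updated_rows does not look like a run that updated no rows.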

Prometheus artifact

Example output:

# HELP dpone_extracted_rows Rows extracted by the dpone run.
# TYPE dpone_extracted_rows gauge
dpone_extracted_rows{env="local",process="orders",run_id="01J...",status="success"} 10000

Use this path when you want a simple Prometheus-compatible file that can be:

  • uploaded as a CI artifact;
  • pushed to a Prometheus Pushgateway by a wrapper job;
  • read by a node-local file collector;
  • attached to release evidence.
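To make the exposition format shown above concrete, here is a minimal sketch of rendering one gauge sample as Prometheus text. It is not the PrometheusTextRenderer code; render_gauge and its helpers are hypothetical names, and the sketch assumes finite values and label names that are already valid.

```python
def escape_label_value(value: str) -> str:
    """Escape backslash, newline, and double quote as the text format requires."""
    return value.replace("\\", "\\\\").replace("\n", "\\n").replace('"', '\\"')


def format_value(value: float) -> str:
    """Print whole numbers without a decimal point, other finite floats via repr."""
    number = float(value)
    return str(int(number)) if number.is_integer() else repr(number)


def render_gauge(name: str, help_text: str, value: float, labels: dict) -> str:
    """Render HELP, TYPE, and one sample line for a gauge metric."""
    # Sorting label names keeps the output byte-stable across runs.
    label_text = ",".join(
        f'{key}="{escape_label_value(str(val))}"' for key, val in sorted(labels.items())
    )
    sample = f"{name}{{{label_text}}}" if label_text else name
    lines = [
        f"# HELP {name} {help_text}",
        f"# TYPE {name} gauge",
        f"{sample} {format_value(value)}",
    ]
    return "\n".join(lines) + "\n"
```

Sorting the labels is a design choice for the sketch: byte-stable output makes the file easier to diff and to checksum in CI.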

OpenTelemetry artifact

The OpenTelemetry artifact is an OTLP-shaped JSON payload with:

  • service.name;
  • service.namespace;
  • optional resource attributes passed with --resource-attr key=value;
  • metric names, descriptions, units, labels, and gauge points.

It is intentionally not a direct collector client. Production deployments can choose their own collector path without forcing optional dependencies on local users.
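As a rough picture of what "OTLP-shaped" means, the sketch below builds a payload that follows the OTLP JSON layout for gauge metrics (resourceMetrics, scopeMetrics, metrics, gauge.dataPoints). The exact structure of opentelemetry_metrics.json is not specified here, so treat the field selection, the scope name dpone.sketch, and the function names as assumptions.

```python
import time
from typing import Optional


def to_attributes(pairs: dict) -> list:
    """Encode a flat dict as OTLP key/value attributes with string values."""
    return [{"key": key, "value": {"stringValue": str(val)}} for key, val in sorted(pairs.items())]


def build_otlp_payload(
    points: list,
    service_name: str,
    service_namespace: str,
    resource_attrs: Optional[dict] = None,
    now_ns: Optional[int] = None,
) -> dict:
    """Wrap {name, value, labels} points in an OTLP-style JSON document."""
    resource = {
        "service.name": service_name,
        "service.namespace": service_namespace,
        **(resource_attrs or {}),
    }
    # OTLP JSON encodes 64-bit integers such as timestamps as strings.
    timestamp = str(now_ns if now_ns is not None else time.time_ns())
    metrics = [
        {
            "name": point["name"],
            "description": point.get("description", ""),
            "unit": point.get("unit", "1"),
            "gauge": {
                "dataPoints": [
                    {
                        "attributes": to_attributes(point.get("labels", {})),
                        "timeUnixNano": timestamp,
                        "asDouble": float(point["value"]),
                    }
                ]
            },
        }
        for point in points
    ]
    return {
        "resourceMetrics": [
            {
                "resource": {"attributes": to_attributes(resource)},
                "scopeMetrics": [{"scope": {"name": "dpone.sketch"}, "metrics": metrics}],
            }
        ]
    }
```

The result is a plain dict, so json.dumps can serialize it for a file adapter or an upload wrapper without any OpenTelemetry SDK installed.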

Using custom metrics

Add manual metrics with repeated --metric name=value flags:

uv run dpone observability metrics-export \
  --run-report .dpone/runs/orders/run_report.json \
  --metric throughput_rows_per_second=85000 \
  --metric freshness_lag_seconds=180 \
  --label env=prod \
  --label pipeline=orders

Custom names are normalized to dpone_<name> unless they already start with dpone_.
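The normalization rule can be expressed as a short sketch. The helper names are hypothetical, and details such as whitespace trimming and rejecting non-numeric values are assumptions rather than documented dpone behavior.

```python
def normalize_metric_name(name: str) -> str:
    """Prefix the name with dpone_ unless it already carries the prefix."""
    name = name.strip()
    return name if name.startswith("dpone_") else f"dpone_{name}"


def parse_metric_flag(raw: str) -> tuple:
    """Split a 'name=value' flag into a normalized name and a float value."""
    name, separator, value = raw.partition("=")
    if not separator or not name.strip():
        raise ValueError(f"expected name=value, got {raw!r}")
    # float() raises ValueError for empty or non-numeric values.
    return normalize_metric_name(name), float(value)
```

With this rule, freshness_lag_seconds=180 becomes dpone_freshness_lag_seconds, while a name such as dpone_custom is left unchanged.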

Add OpenTelemetry resource attributes with repeated --resource-attr key=value flags. Use resource attributes for stable deployment identity and metric labels for low-cardinality run dimensions:

uv run dpone observability metrics-export \
  --run-report .dpone/runs/orders/run_report.json \
  --output-dir .dpone/observability/orders \
  --label pipeline=orders \
  --label strategy=incremental_merge \
  --resource-attr deployment.environment=prod \
  --resource-attr service.version=0.7.1

Prometheus label names are sanitized to Prometheus-compatible names. For example, source.system=postgres is rendered as source_system="postgres".
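One plausible way to implement this sanitization is to replace every character outside the Prometheus label-name alphabet with an underscore. The sketch below assumes that approach. The function name is hypothetical, and the handling of a leading digit is a guess, since only the dot-to-underscore case is documented here.

```python
import re

# Prometheus label names must match [a-zA-Z_][a-zA-Z0-9_]*.
_INVALID_LABEL_CHARS = re.compile(r"[^a-zA-Z0-9_]")


def sanitize_label_name(name: str) -> str:
    """Replace invalid characters with '_' and guard against a leading digit."""
    cleaned = _INVALID_LABEL_CHARS.sub("_", name)
    if not cleaned or cleaned[0].isdigit():
        # Assumption: prefix with '_' so the result is still a valid label name.
        cleaned = "_" + cleaned
    return cleaned
```

Note that distinct inputs can collide after sanitization (source.system and source_system map to the same name), which is one more reason to standardize label names in the wrapper.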

CI artifact pattern

Recommended CI path:

uv run dpone run "$MANIFEST" --format json > test_artifacts/runs/run_report.json

uv run dpone observability metrics-export \
  --run-report test_artifacts/runs/run_report.json \
  --output-dir test_artifacts/observability/current \
  --label ci_run_id="$GITHUB_RUN_ID" \
  --label branch="$GITHUB_REF_NAME" \
  --format json

Upload the whole test_artifacts/observability/current/ directory even on failure. This preserves metrics for failed runs, which are the runs regression triage needs most.

The repository includes .github/workflows/observability-maturity.yml, a credential-free gate that runs weekly and can be triggered manually. It runs the observability tests, exports Prometheus/OpenTelemetry artifacts from a synthetic run report, evaluates an SLO smoke check, builds an artifact index, and uploads the observability-maturity-report artifact.

Runbook

| Symptom | Likely cause | Action |
| --- | --- | --- |
| metrics.empty blocker | No run report and no --metric flags were provided. | Pass --run-report or at least one --metric name=value. |
| run_report.missing blocker | Path in --run-report does not exist in the current job workspace. | Upload/download the run artifact first, or use an absolute path inside CI. |
| run_report.invalid_json blocker | The input file is not JSON output from dpone run --format json. | Re-run with --format json and redirect only stdout. |
| Prometheus labels look wrong | The wrapper passed inconsistent labels. | Standardize labels in CI: env, pipeline, source, sink, strategy, ci_run_id. |
| OTel collector rejects the artifact | The artifact is OTLP-shaped JSON, not a direct protobuf request. | Use a collector/file adapter or a small upload wrapper owned by your platform. |
| metrics_index.json checksum drift | A file was modified after export or an artifact was regenerated out of order. | Re-run dpone observability metrics-export and upload the full output directory as one immutable artifact. |
| observability-maturity.yml is red | Tests, metrics export, SLO smoke, or artifact indexing failed. | Open observability-maturity-report, inspect runtime_metrics.json, metrics_index.json, and slo_report.json, then reproduce the failing command locally. |
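When investigating checksum drift, it can help to recompute the SHA-256 digests locally and compare them with the recorded values. The sketch below takes the expected digests as a plain mapping of file name to hex digest. The layout of metrics_index.json is not described here, so loading that mapping from the manifest is left to the reader, and find_drift is a hypothetical helper name.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large artifacts do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_drift(output_dir: str, expected: dict) -> list:
    """Return the file names that are missing or whose digest differs.

    `expected` maps file name -> hex SHA-256 digest (assumed shape, see lead-in).
    """
    drifted = []
    for name, want in sorted(expected.items()):
        path = Path(output_dir) / name
        if not path.is_file() or sha256_of(path) != want.lower():
            drifted.append(name)
    return drifted
```

An empty result means the files still match the recorded digests. Any listed name points at a file that was edited, regenerated, or dropped after the export.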