# Safety Controls in the Runtime Config
Add or refine the `runtime_safety` block in your SDK YAML:
| Parameter | Default | Effect |
|---|---|---|
| `drift.enabled` | `true` | Toggle statistical drift detection based on reward deltas. |
| `drift.window` | `50` | Samples used for baseline statistics (increase for noisy domains). |
| `drift.z_threshold` | `3.0` | Standard deviations required to raise a drift alert. |
| `drift.min_baseline` | `5` | Minimum samples before alerts trigger. |
| `review.require_approval` | `true` | Gate exports and learning updates on reviewer approval. |
| `review.default_export_statuses` | `["approved"]` | Review states included when tooling omits explicit filters. |
Drift alerts surface in `sessions.metadata["drift"]` and as trigger flags in the review CLI. Review settings feed directly
into `arc-atlas export` / `atlas train`, so production pipelines default to approved sessions only.
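Assembled from the table above, a minimal sketch of the block; the exact nesting (whether `runtime_safety` sits at the top level of the SDK YAML) is an assumption, so confirm it against your config:

```yaml
# Sketch assembled from the defaults above; verify the nesting
# against your SDK YAML before committing.
runtime_safety:
  drift:
    enabled: true          # statistical drift detection on reward deltas
    window: 50             # samples used for baseline statistics
    z_threshold: 3.0       # standard deviations needed to raise an alert
    min_baseline: 5        # minimum samples before alerts trigger
  review:
    require_approval: true                  # gate exports and learning updates on approval
    default_export_statuses: ["approved"]   # statuses used when tooling omits filters
```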
## Review Workflow

- Approve or quarantine sessions – The `arc-atlas review sessions` listing groups sessions by review status and highlights drift alerts, reward deltas, and uncertainty changes so reviewers can triage quickly.
- Export only the data you trust – Omit `--include-status` to inherit `runtime_safety.review.default_export_statuses`. For local testing, set `ATLAS_REVIEW_REQUIRE_APPROVAL=0` to bypass the gate; never disable it in production. A sketch of the export commands follows this list.
- Feed the evaluation harnesses – The learning evaluation harness counts review statuses in its summaries. Pending sessions are a signal that human review is still in progress; include or exclude them deliberately when comparing runs.
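A hedged sketch of the export gate in practice. Only `arc-atlas export`, `--include-status`, and `ATLAS_REVIEW_REQUIRE_APPROVAL` come from this page; repeating the flag per status and omitting other arguments (output paths, database URLs) are assumptions:

```bash
# Production: omit --include-status so the export inherits
# runtime_safety.review.default_export_statuses (approved sessions only).
arc-atlas export

# Comparison run: pull in pending sessions explicitly.
# (Repeating --include-status per status is an assumption; check the CLI help.)
arc-atlas export --include-status approved --include-status pending

# Local testing only: bypass the approval gate via the environment.
# Never set this in production or CI configs.
ATLAS_REVIEW_REQUIRE_APPROVAL=0 arc-atlas export
```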
## Responding to Drift

- Alert inspection – Review the `drift` object in `arc-atlas review sessions` output. It contains z-scores, deltas, and reason strings pointing at the underlying metric.
- Pause updates – Temporarily disable playbook persistence by setting `learning.update_enabled=false`; this keeps existing guidance in place while you investigate.
- Re-run evaluation – Use `scripts/eval_learning.py --learning-key <key>` to confirm the issue and gather context for root-cause analysis (a sketch follows this list).
- Rollback – If a playbook caused the regression, reset it by deleting the entry from `learning_registry` or restoring a previously exported pamphlet, then re-enable updates.
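A minimal sketch of the pause-and-verify loop; the nesting of `learning.update_enabled` in the YAML and the `python` launcher are assumptions:

```bash
# 1. Pause playbook persistence in the SDK YAML before digging in
#    (assumed nesting):
#
#      learning:
#        update_enabled: false
#
# 2. Re-run the learning evaluation harness against the affected key
#    to confirm the regression and capture context for root-cause analysis.
python scripts/eval_learning.py --learning-key <key>

# 3. Once resolved, flip update_enabled back to true so playbook
#    persistence resumes.
```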
## Database Signals to Monitor

- `sessions.review_status` & `sessions.review_notes` – authoritative state for approval.
- `sessions.metadata.drift` – contains drift z-scores and explanations.
- `learning_registry.updated_at` – spot stale playbooks that may indicate paused updates.
- `trajectory_events.event.event_type` – inspect underlying telemetry (e.g., `reward`, `guidance`, `validation`) when diagnosing regressions.

See the Database Schema reference for column details and index coverage.
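As an illustration, a hedged SQL sketch of a monitoring query over these signals. The table and column names come from the list above; the JSON operators, the `'pending'` status string, and the Postgres-style backend are assumptions:

```sql
-- Surface sessions that still need review or carry a drift payload.
-- Adjust operators and status values to the actual schema.
SELECT review_status,
       review_notes,
       metadata -> 'drift' AS drift
FROM sessions
WHERE review_status = 'pending'
   OR metadata -> 'drift' IS NOT NULL;
```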
## Best Practices

- Automate reviews – Alert on pending sessions that exceed a time threshold or have drift alerts; build lightweight dashboards from the `sessions` table.
- Document decisions – Use `--note` when approving/quarantining so investigators have context later.
- Audited exports – Store export manifests alongside training jobs (timestamp, review statuses included, CLI flags); a sample manifest sketch follows this list.
- CI safeties – Keep `review.require_approval=true` in checked-in configs. Only override via env vars inside isolated dev environments.
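The SDK does not prescribe a manifest format here; the YAML below is a hypothetical layout covering the fields named above (timestamp, statuses included, CLI flags):

```yaml
# Hypothetical export manifest stored next to the training job.
# Every field name and value here is illustrative, not an SDK-defined schema.
export:
  exported_at: "2025-06-01T14:32:00Z"    # timestamp of the export run
  included_statuses: ["approved"]        # review statuses present in the dataset
  command: "arc-atlas export"            # exact CLI invocation, flags included
  require_approval: true                 # runtime_safety.review.require_approval at export time
```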