Safety Controls in the Runtime Config
Add or refine the runtime_safety block in your SDK YAML:
| Parameter | Default | Effect | 
|---|---|---|
| drift.enabled | true | Toggle statistical drift detection based on reward deltas. | 
| drift.window | 50 | Samples used for baseline statistics (increase for noisy domains). | 
| drift.z_threshold | 3.0 | Standard deviations required to raise a drift alert. | 
| drift.min_baseline | 5 | Minimum samples before alerts trigger. | 
| review.require_approval | true | Gate exports and learning updates on reviewer approval. | 
| review.default_export_statuses | ["approved"] | Review states included when tooling omits explicit filters. | 
Drift alerts are written to sessions.metadata["drift"] and trigger flags in the review CLI. Review settings feed directly
into arc-atlas export / atlas train, so production pipelines default to approved sessions only.
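To make the interaction between drift.window, drift.z_threshold, and drift.min_baseline concrete, here is a minimal sketch of a rolling z-score check on rewards. It is an illustration under stated assumptions, not the SDK's implementation: the DriftDetector class, its observe method, and the alert fields are hypothetical names chosen for this example.

```python
from collections import deque
from statistics import mean, stdev


class DriftDetector:
    """Illustrative rolling z-score check (hypothetical, not the SDK's code)."""

    def __init__(self, window=50, z_threshold=3.0, min_baseline=5):
        self.z_threshold = z_threshold
        # stdev() needs at least two samples, so never go below that.
        self.min_baseline = max(min_baseline, 2)
        # The deque keeps only the most recent `window` rewards.
        self.baseline = deque(maxlen=window)

    def observe(self, reward):
        """Return an alert dict if `reward` deviates from the baseline, else None."""
        alert = None
        if len(self.baseline) >= self.min_baseline:
            mu = mean(self.baseline)
            sigma = stdev(self.baseline)
            if sigma > 0:
                z = (reward - mu) / sigma
                if abs(z) >= self.z_threshold:
                    alert = {
                        "z_score": z,
                        "delta": reward - mu,
                        "reason": f"reward z-score {z:.2f} exceeds {self.z_threshold}",
                    }
        # The new sample joins the baseline after it has been checked.
        self.baseline.append(reward)
        return alert


detector = DriftDetector()
rewards = [0.80, 0.82, 0.79, 0.81, 0.80, 0.80, 0.20]
alerts = [detector.observe(r) for r in rewards]
```

In this run the first five rewards only build the baseline, the sixth is checked and passes, and the final sharp drop raises an alert. A larger window smooths the baseline for noisy domains at the cost of reacting more slowly to gradual shifts.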
Review Workflow
- Approve or quarantine sessions – The listing groups sessions by review status and highlights drift alerts, reward deltas, and uncertainty changes so reviewers can triage quickly.
- Export only the data you trust – Omit --include-status to inherit runtime_safety.review.default_export_statuses. For local testing, set ATLAS_REVIEW_REQUIRE_APPROVAL=0 to bypass the gate; never disable it in production.
- Feed the evaluation harnesses – The learning evaluation harness counts review statuses in its summaries. Pending sessions are a signal that human review is still in progress; include or exclude them deliberately when comparing runs.
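The status filtering described in this workflow can be sketched as follows. This is a hedged illustration of the fallback logic only: the function names, the session dictionaries, and the "pending" and "quarantined" status strings are assumptions made for the example, and the real export tooling may behave differently.

```python
from collections import Counter

# Mirrors the documented default for review.default_export_statuses.
DEFAULT_EXPORT_STATUSES = ["approved"]


def resolve_statuses(include_status=None, defaults=DEFAULT_EXPORT_STATUSES):
    """Use explicit statuses when given, otherwise fall back to the defaults."""
    return list(include_status) if include_status else list(defaults)


def select_for_export(sessions, include_status=None):
    """Keep only sessions whose review status is allowed for export."""
    allowed = set(resolve_statuses(include_status))
    return [s for s in sessions if s["review_status"] in allowed]


def summarize_statuses(sessions):
    """Count sessions per review status, as an evaluation summary might."""
    return Counter(s["review_status"] for s in sessions)


# Hypothetical session rows for illustration.
sessions = [
    {"id": 1, "review_status": "approved"},
    {"id": 2, "review_status": "pending"},
    {"id": 3, "review_status": "quarantined"},
]
default_export = select_for_export(sessions)
explicit_export = select_for_export(sessions, ["approved", "pending"])
summary = summarize_statuses(sessions)
```

Keeping the fallback in a single function makes it easy to see which statuses a run actually used, which matters when comparing evaluation runs that treat pending sessions differently.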
Responding to Drift
- Alert inspection – Review the drift object in arc-atlas review sessions output. It contains z-scores, deltas, and reason strings pointing at the underlying metric.
- Pause updates – Temporarily disable playbook persistence by setting learning.update_enabled=false; this keeps existing guidance in place while you investigate.
- Re-run evaluation – Use scripts/eval_learning.py --learning-key <key> to confirm the issue and gather context for root-cause analysis.
- Rollback – If a playbook caused the regression, reset it by deleting the entry from learning_registry or restoring a previously exported pamphlet, then re-enable updates.
Database Signals to Monitor
- sessions.review_status & sessions.review_notes – authoritative state for approval.
- sessions.metadata.drift – contains drift z-scores and explanations.
- learning_registry.updated_at – spot stale playbooks that may indicate paused updates.
- trajectory_events.event.event_type – inspect underlying telemetry (e.g., reward, guidance, validation) when diagnosing regressions.
See the Database Schema reference for column details and index coverage.
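As a rough sketch of how these signals might be checked in a monitoring job, the functions below operate on rows that have already been fetched as dictionaries. The review_status, metadata.drift, and updated_at fields come from the list above; the id and learning_key keys, the "pending" status string, and the seven-day threshold are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def stale_playbooks(registry_rows, now, max_age=timedelta(days=7)):
    """Return keys of learning_registry rows not updated within `max_age`."""
    return [
        row["learning_key"]
        for row in registry_rows
        if now - row["updated_at"] > max_age
    ]


def pending_with_drift(session_rows):
    """Return ids of sessions awaiting review that also carry a drift payload."""
    return [
        row["id"]
        for row in session_rows
        if row["review_status"] == "pending"
        and (row.get("metadata") or {}).get("drift")
    ]


# Hypothetical rows for illustration.
now = datetime(2024, 6, 15, tzinfo=timezone.utc)
registry_rows = [
    {"learning_key": "support-bot", "updated_at": now - timedelta(days=30)},
    {"learning_key": "triage-bot", "updated_at": now - timedelta(days=1)},
]
session_rows = [
    {"id": 10, "review_status": "pending", "metadata": {"drift": {"z_score": -4.2}}},
    {"id": 11, "review_status": "pending", "metadata": {}},
    {"id": 12, "review_status": "approved", "metadata": {"drift": {"z_score": 3.5}}},
]
stale = stale_playbooks(registry_rows, now)
needs_review = pending_with_drift(session_rows)
```

A stale playbook is not necessarily a problem; it may simply mean updates were paused on purpose during an investigation, so treat the result as a prompt to check rather than an error.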
Best Practices
- Automate reviews – Alert on pending sessions that exceed a time threshold or have drift alerts; build lightweight
dashboards from the sessions table.
- Document decisions – Use --note when approving/quarantining so investigators have context later.
- Audited exports – Store export manifests alongside training jobs (timestamp, review statuses included, CLI flags).
- CI safeties – Keep review.require_approval=true in checked-in configs. Only override via env vars inside isolated dev environments.
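For the audited-exports practice above, one possible shape for a manifest is sketched below. The build_export_manifest helper and the manifest field names are hypothetical, and the sample argument list is only illustrative; the point is to record the timestamp, the review statuses included, and the CLI arguments next to each training job.

```python
import json
from datetime import datetime, timezone


def build_export_manifest(statuses, cli_args, now=None):
    """Assemble a small, JSON-serializable record of one export run."""
    now = now or datetime.now(timezone.utc)
    return {
        "exported_at": now.isoformat(),
        "review_statuses": sorted(statuses),
        "cli_args": list(cli_args),
    }


# Illustrative values; the argument list is an assumption, not a verified command.
manifest = build_export_manifest(
    statuses=["approved"],
    cli_args=["arc-atlas", "export", "--include-status", "approved"],
    now=datetime(2024, 1, 1, tzinfo=timezone.utc),
)
manifest_json = json.dumps(manifest, indent=2)
```

Writing manifest_json to a file beside the training job's outputs gives investigators a record of exactly which review states fed a given run.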
