
What is ATLAS?
ATLAS lets your agents learn from every task they run in production: improving reliability, reducing token costs, and building domain expertise over time through persistent memory. ATLAS turns your production environment into a training ground, enabling learning at inference time and preventing your agent's performance from plateauing. Every interaction feeds a continuous loop of feedback, improvement, and redeployment, so deployment is not the end of learning.

How it works: you wrap your agent within your current stack and environment, and ATLAS handles the complexity of continual learning at scale. The result: higher task success rates, lower token usage, and increased reliability for mission-critical workflows. ATLAS gives you full observability and control over the learning process; the framework handles the orchestration complexity while you retain ownership of your data, models, and training decisions.
The Value Proposition
- Reduced token costs – ATLAS uses adaptive supervision lanes to allocate expensive reasoning only when needed, and improves policy efficiency over time via reward-guided optimization and teacher checkpoint updates.
- Increased task success rate – The student and teacher architecture performs real-time coaching and escalation, correcting errors before they impact production workflows.
- Compounded & transferable knowledge – Persistent memory and offline RL turn production traces into learning traces, updating the teacher model so the agent continuously improves post-deployment. Your agent builds a durable library of domain expertise rather than treating deployment as a static endpoint.

The result: agents that become cheaper, more accurate, and more reliable the longer they run.
How It Works: Closed-Loop Learning System
ATLAS wraps any base model (GPT, Claude, Gemini, open-source checkpoints, or your own) with an inference-time closed-loop learning system that observes the agent's action space in its live environment. The system executes tasks with built-in quality control that reviews every decision, and the Reward System scores the outcome. That signal can immediately trigger retries or feed downstream training jobs. The same loop powers both the runtime SDK (real-time quality control) and the training stack (offline optimization).
What ATLAS Provides
ATLAS wraps your existing agent framework with four components that create a complete learning loop:

- Reasoning Core: Dual-agent reasoning loop (student + verifying teacher) that guides execution and captures learning signals
 - Reward System: Turns user feedback into dense reward signals (achieves 93.7% accuracy on RewardBench V2)
 - Learning Engine: Uses offline reinforcement learning (GRPO) to update models based on rewards
 - Persistent Memory: Stores all interactions in structured trace files for analysis and retraining
 

ATLAS keeps your agent in a learn–evaluate–update cycle.
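In code terms, that cycle looks roughly like the sketch below. Every name here (`run_task`, `score_outcome`, `queue_for_training`, the `0.8` threshold) is a hypothetical placeholder to illustrate the control flow, not an Atlas SDK call.

```python
# Schematic of the learn-evaluate-update cycle (hypothetical names, not SDK calls).
from typing import Callable

def closed_loop(task: str,
                run_task: Callable[[str], str],
                score_outcome: Callable[[str, str], float],
                queue_for_training: Callable[[str, str, float], None],
                max_retries: int = 2) -> str:
    """Execute a task, score the result, retry on low reward, and
    persist the trace so offline RL can update the teacher later."""
    for _attempt in range(max_retries + 1):
        result = run_task(task)                   # student acts, teacher verifies
        reward = score_outcome(task, result)      # Reward System scores the outcome
        queue_for_training(task, result, reward)  # trace feeds offline GRPO runs
        if reward >= 0.8:                         # hypothetical acceptance threshold
            return result                         # good enough: return the answer
    return result                                 # out of retries: escalate best effort
```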
Runtime for ML Engineers
- Autodiscovery CLI – Install the SDK (`pip install arc-atlas`), run `atlas env init` to discover your agent/environment pair, and execute tasks with `atlas run`. The CLI loads `.env`, scaffolds configs when needed, and records metadata under `.atlas/`.
- Orchestrator loop – Each run triages a task, probes capability, and routes into `auto`, `paired`, `coach`, or `escalate`. The student agent works alongside a verifying teacher while telemetry streams through `atlas.runtime.telemetry`.
- Telemetry & exports – Persist sessions to Postgres (`storage` block) and export reviewed traces with the CLI (`arc-atlas … --include-status approved --output traces.jsonl`). Review gating keeps production datasets safe before they feed training.
- Learning playbooks – The runtime synthesizes student/teacher playbooks and stores them in `learning_registry`; see Learning System Architecture for how playbooks influence future prompts.
- Offline training – Feed exported traces into the Runtime Traces dataset config and GRPO trainers to ship bespoke teachers without hand-labeling.
See the Atlas CLI Reference for the full set of commands and flags.
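A typical first session strings the documented commands together. This is a sketch: `$DATABASE_URL` stands in for your Postgres connection string, and any task arguments to `atlas run` are omitted (see the CLI reference for full invocations).

```bash
# Install the SDK and discover your agent/environment pair
pip install arc-atlas
atlas env init        # scaffolds configs and records metadata under .atlas/
atlas run             # executes a task through the orchestrator loop

# Export reviewed traces for offline training
arc-atlas --database-url "$DATABASE_URL" \
  --include-status approved \
  --output traces.jsonl
```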
Data Ownership: Atlas never modifies model weights at runtime; only RL training, which you control, updates weights. Trace storage is optional and self-hosted. You own all data.
Runtime vs. Training: Online continual learning (adaptive runtime with dual-agent orchestration) is implemented in the atlas-sdk. Offline RL training (GRPO) is implemented in Atlas Core (this repository).
End-to-End Lifecycle at a Glance
| Stage | Run This | Output | Typical Effort | 
|---|---|---|---|
| Runtime quality control | `atlas.core.run(..., stream_progress=True)` | Reviewed plan, per-step traces, live reward scores | Minutes |
| Persist + export | `storage` block + `arc-atlas --database-url … --include-status approved --output traces.jsonl` | JSONL dataset mirroring production behaviour | Minutes |
| Export + train workflow | `scripts/run_offline_pipeline.py` | New teacher checkpoint trained from runtime traces | Minutes to launch (training time depends on compute) |
| Custom training | GRPO pipeline | Bespoke teacher checkpoint, ready to deploy | Multi-hour job on GPUs |
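The first stage of the table is a single call. A minimal sketch, assuming `atlas.core.run` takes the task as its first argument; the exact signature and configuration are covered in the SDK Quickstart.

```python
# Runtime quality control in one call (signature assumed; see the SDK Quickstart).
import atlas.core

result = atlas.core.run(
    "Triage the failing checkout service and draft a remediation plan",  # example task
    stream_progress=True,  # stream the reviewed plan, per-step traces, and reward scores
)
```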
Getting Started: Two Paths
Choose your starting point based on your goal.

🔧 Ready to ship code? Start with the SDK Quickstart: it walks through installation, configuration, and running your first dual-agent task in minutes.

See the Atlas SDK in action, from installation to measurable performance gains across real examples.
| I want to… | Use this Path | Key Docs | 
|---|---|---|
| Orchestrate tasks with a structured runtime loop. | Atlas SDK | SDK Quickstart | 
| Wrap my existing agent in a quality-control loop. | Atlas SDK | BYOA Adapters | 
| Convert runtime traces into GRPO training runs. | Atlas Core | Offline Training Guide | 
| Fine-tune a custom model with RL. | Training & Optimization | Offline Training Guide | 
SDK Runtime Orchestration
Use the Atlas orchestrator to run an existing agent with a closed-loop learning system. Get started in minutes.
Offline Training (Atlas Core)
Convert exported runtime traces into GRPO training jobs, evaluate reward deltas, and ship updated teacher checkpoints.
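The fastest route is the export + train workflow from the lifecycle table above. A sketch only; the script's required arguments (trace paths, base checkpoints, output directories) are documented in the Offline Training Guide.

```bash
# Turn exported runtime traces into a new teacher checkpoint
# (required flags omitted; see the Offline Training Guide).
python scripts/run_offline_pipeline.py
```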
Research & Resources
Learn more about the methodology and science behind ATLAS:

- ATLAS Technical Report (PDF) – Complete methodology, benchmarks, and implementation details
- Arc Research – Our latest research advancing continual learning systems
- GitHub Repository – Source code, examples, and issue tracking
- HuggingFace Models – Pre-trained models
- Evaluation Harnesses – Scripts for measuring runtime, reward, and learning performance