Available Datasets

ATLAS provides curated datasets for training adaptive teachers and evaluating system performance.

Primary Dataset

Arc-ATLAS-Teach-v0

View on Hugging Face

Comprehensive teaching interaction dataset for RL training
Purpose: Train teacher models to provide adaptive guidance across diverse tasks.

Statistics:
  • Total examples: 100,000+ teaching interactions
  • Task domains: Mathematics, reasoning, coding, debugging
  • Formats: SFT and RL training splits
  • Languages: English
Data Schema:
```json
{
  "prompt": "The problem or task requiring solution",
  "ground_truth": "Correct answer or solution",
  "student_response": "Initial student attempt",
  "teaching": "Adaptive guidance provided",
  "enhanced_response": "Student response after teaching",
  "baseline_score": 0.3,
  "with_teaching_score": 0.9,
  "reward": 0.6,
  "problem_id": "unique_identifier",
  "student_level": "weak|moderate|strong",
  "domain": "math|reasoning|code|debug"
}
```
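As a quick sanity check on records in this schema, the reward should equal the delta between the with-teaching and baseline scores. The sketch below validates one record against the fields above; the `validate_record` helper itself is illustrative, not part of the dataset tooling.

```python
# Minimal sketch: check a record against the Arc-ATLAS-Teach-v0 schema and
# confirm reward == with_teaching_score - baseline_score.
REQUIRED_FIELDS = {
    "prompt": str, "ground_truth": str, "student_response": str,
    "teaching": str, "enhanced_response": str,
    "baseline_score": float, "with_teaching_score": float, "reward": float,
    "problem_id": str, "student_level": str, "domain": str,
}

def validate_record(record: dict) -> float:
    """Check field presence/types and return the teaching score delta."""
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(record.get(field), expected_type):
            raise ValueError(f"missing or mistyped field: {field}")
    if record["student_level"] not in {"weak", "moderate", "strong"}:
        raise ValueError("invalid student_level")
    if record["domain"] not in {"math", "reasoning", "code", "debug"}:
        raise ValueError("invalid domain")
    return record["with_teaching_score"] - record["baseline_score"]

record = {
    "prompt": "The problem or task requiring solution",
    "ground_truth": "Correct answer or solution",
    "student_response": "Initial student attempt",
    "teaching": "Adaptive guidance provided",
    "enhanced_response": "Student response after teaching",
    "baseline_score": 0.3,
    "with_teaching_score": 0.9,
    "reward": 0.6,
    "problem_id": "example_001",
    "student_level": "weak",
    "domain": "math",
}

delta = validate_record(record)
print(round(delta, 1))  # 0.6
```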
Loading the Dataset:
```python
from datasets import load_dataset

# Load for supervised fine-tuning
sft_data = load_dataset(
    "Arc-Intelligence/Arc-ATLAS-Teach-v0",
    "sft",
    split="train"
)

# Load for reinforcement learning
rl_data = load_dataset(
    "Arc-Intelligence/Arc-ATLAS-Teach-v0",
    "rl",
    split="train"
)

# Load validation set
val_data = load_dataset(
    "Arc-Intelligence/Arc-ATLAS-Teach-v0",
    "rl",
    split="validation"
)
```
File Structure:
```text
Arc-ATLAS-Teach-v0/
├── training/
│   ├── sft.jsonl         # Supervised fine-tuning data
│   └── rl.jsonl          # Reinforcement learning data
└── validation/
    └── rl.jsonl          # Held-out validation
```

Domain-Specific Subsets

Mathematics Subset

Focus: Step-by-step mathematical reasoning.

Example:
```json
{
  "prompt": "Sarah has 24 apples. She gives 1/3 to her brother...",
  "ground_truth": "12",
  "teaching": "Break down: 1) Calculate 1/3 of 24 = 8..."
}
```
Filtering:
```python
math_data = dataset.filter(lambda x: x['domain'] == 'math')
```

Code Generation Subset

Focus: Programming tasks and debugging.

Example:
```json
{
  "prompt": "Write a function to validate email addresses",
  "ground_truth": "def validate_email(email):...",
  "teaching": "Consider regex pattern, edge cases like..."
}
```
Filtering:
```python
code_data = dataset.filter(lambda x: x['domain'] == 'code')
```

SRE/Debugging Subset

Focus: System reliability and debugging scenarios.

Example:
```json
{
  "prompt": "Service returns 503 errors intermittently",
  "ground_truth": "Check service mesh configuration...",
  "teaching": "Systematic approach: 1) Check Istio configs..."
}
```
Filtering:
```python
sre_data = dataset.filter(lambda x: x['domain'] == 'debug')
```
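The three `.filter` calls above all use the same predicate on the `domain` field. A self-contained sketch of the idea on an in-memory sample (the toy records are illustrative placeholders, not dataset content):

```python
# Sketch: partition records into domain subsets with the same predicate
# used by the dataset.filter calls above.
records = [
    {"problem_id": "p1", "domain": "math"},
    {"problem_id": "p2", "domain": "code"},
    {"problem_id": "p3", "domain": "debug"},
    {"problem_id": "p4", "domain": "math"},
]

subsets = {
    name: [r for r in records if r["domain"] == name]
    for name in ("math", "code", "debug")
}

print({name: len(rows) for name, rows in subsets.items()})
# {'math': 2, 'code': 1, 'debug': 1}
```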

Data Quality Metrics

Coverage Statistics

| Domain          | Examples | Avg Length | Unique Patterns |
|-----------------|----------|------------|-----------------|
| Mathematics     | 35,000   | 250 tokens | 500+            |
| Code Generation | 30,000   | 400 tokens | 800+            |
| Reasoning       | 25,000   | 300 tokens | 600+            |
| Debugging       | 10,000   | 350 tokens | 400+            |
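Coverage stats like these can be reproduced over any split. A sketch using a whitespace split as a stand-in for the real tokenizer (the sample records are illustrative):

```python
# Sketch: per-domain average prompt length, with whitespace tokens as a
# rough proxy for tokenizer counts.
from collections import defaultdict

records = [
    {"domain": "math", "prompt": "Sarah has 24 apples ..."},
    {"domain": "math", "prompt": "Solve for x in 3x + 1 = 10"},
    {"domain": "code", "prompt": "Write a function to validate email addresses"},
]

lengths = defaultdict(list)
for r in records:
    lengths[r["domain"]].append(len(r["prompt"].split()))

stats = {domain: sum(v) / len(v) for domain, v in lengths.items()}
print(stats["code"])  # 7.0
```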

Performance Baselines

| Metric           | Baseline | w/ Dual-Agent Loop | Improvement |
|------------------|----------|--------------------|-------------|
| Accuracy         | 62.3%    | 78.0%              | +15.7%      |
| Completion       | 69%      | 100%               | +31%        |
| Token Efficiency | 100%     | 50%                | -50%        |
These figures reflect the closed-loop runtime combined with the GRPO-trained baseline. If you need task-specific adaptation between offline training runs, online continual learning now lives in the atlas-sdk runtime.

Creating Custom Datasets

To create custom datasets from runtime traces or other sources, see Custom Dataset Creation:
  • Data format requirements
  • Preprocessing JSONL exports from the SDK
  • Postgres-backed workflows
  • Quality validation
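As a starting point before reading that guide, converting a runtime trace into a schema-compliant record is mostly field mapping plus computing the reward delta. The trace field names below (`task`, `answer`, `attempt`, `guidance`, `revised`, `scores`) are hypothetical; adapt them to your SDK export.

```python
# Sketch: map a hypothetical runtime trace onto the dataset schema.
def trace_to_record(trace: dict, problem_id: str) -> dict:
    baseline = trace["scores"]["baseline"]
    with_teaching = trace["scores"]["with_teaching"]
    return {
        "prompt": trace["task"],
        "ground_truth": trace["answer"],
        "student_response": trace["attempt"],
        "teaching": trace["guidance"],
        "enhanced_response": trace["revised"],
        "baseline_score": baseline,
        "with_teaching_score": with_teaching,
        # Reward is the score delta, matching the schema example above
        "reward": with_teaching - baseline,
        "problem_id": problem_id,
        "student_level": trace.get("student_level", "moderate"),
        "domain": trace.get("domain", "reasoning"),
    }

trace = {
    "task": "Simplify 6/8",
    "answer": "3/4",
    "attempt": "6/8",
    "guidance": "Divide numerator and denominator by 2.",
    "revised": "3/4",
    "scores": {"baseline": 0.0, "with_teaching": 1.0},
    "domain": "math",
}
converted = trace_to_record(trace, "trace_0001")
print(converted["reward"], converted["domain"])  # 1.0 math
```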

Contributing Data

We welcome contributions to improve ATLAS datasets:
  1. Format your data according to the schema
  2. Validate quality using provided tools
  3. Test with models to ensure compatibility
  4. Submit PR with data and documentation
See Contributing Guidelines for details.

License and Citation

Datasets are released under Apache 2.0 license. If you use these datasets, please cite:
```bibtex
@dataset{atlas_teach_v0,
  title={Arc-ATLAS-Teach-v0: Adaptive Teaching Dataset},
  author={Arc Intelligence Team},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/datasets/Arc-Intelligence/Arc-ATLAS-Teach-v0}
}
```

Next Steps