Available Models

ATLAS provides pre-trained teacher models optimized for different tasks. All models are 8B parameters and trained using the GRPO algorithm with adaptive teaching objectives.

Teacher Models

ATLAS-8B-Thinking

View on Hugging Face

Optimized for mathematical and logical reasoning tasks
Best for:
  • Mathematical problem solving
  • Logical reasoning
  • Abstract thinking tasks
  • Scientific analysis
Usage:
# Load the ATLAS-8B-Thinking teacher and its tokenizer
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "Arc-Intelligence/ATLAS-8B-Thinking",
    torch_dtype="auto",   # keep the checkpoint's native precision
    device_map="auto"     # place layers across available GPUs automatically
)
tokenizer = AutoTokenizer.from_pretrained(
    "Arc-Intelligence/ATLAS-8B-Thinking"
)
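Once loaded, the teacher can be queried like any causal language model. The snippet below is a minimal generation sketch; the prompt and sampling settings are illustrative, not values from the ATLAS documentation.

# Minimal inference sketch (illustrative prompt and generation settings)
prompt = "Solve step by step: what is the sum of the first 100 positive integers?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))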
Training:
  • Base model: Qwen2.5-7B-Instruct
  • Training method: SFT → GRPO
  • Specialization: Reasoning-heavy tasks

ATLAS-8B-Instruct

View on Hugging Face

Optimized for code generation and technical instruction
Best for:
  • Code generation
  • Technical documentation
  • System administration
  • API integration
Usage:
# Load the ATLAS-8B-Instruct teacher and its tokenizer (same pattern as above)
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "Arc-Intelligence/ATLAS-8B-Instruct",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(
    "Arc-Intelligence/ATLAS-8B-Instruct"
)
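For instruction-style prompts, the tokenizer's chat template can be applied first. This is a minimal sketch assuming the checkpoint ships a default chat template; the coding request is illustrative.

# Sketch: format a coding request with the tokenizer's chat template
messages = [
    {"role": "user", "content": "Write a Python function that retries an HTTP GET with exponential backoff."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=512)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))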
Training:
  • Base model: Qwen2.5-7B-Instruct
  • Training method: SFT → GRPO
  • Specialization: Instruction-following and coding

Model Selection Guide

Choose the appropriate teacher model based on your task:
| Task Type | Recommended Model | Reasoning |
|---|---|---|
| Math problems | ATLAS-8B-Thinking | Specialized in step-by-step reasoning |
| Debugging | ATLAS-8B-Instruct | Better at code understanding |
| Data analysis | ATLAS-8B-Thinking | Strong analytical capabilities |
| API development | ATLAS-8B-Instruct | Trained on technical documentation |
| Logic puzzles | ATLAS-8B-Thinking | Abstract reasoning focus |
| DevOps tasks | ATLAS-8B-Instruct | System administration expertise |
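When model selection needs to happen in code, a simple lookup keyed on task category is enough. The helper below is a hypothetical convenience function based on the table above, not part of the ATLAS package.

# Hypothetical helper: map a task category to a teacher checkpoint
REASONING_TASKS = {"math", "logic", "data_analysis", "science"}

def select_teacher(task_type: str) -> str:
    """Return the Hugging Face model ID suggested by the selection guide."""
    if task_type.lower() in REASONING_TASKS:
        return "Arc-Intelligence/ATLAS-8B-Thinking"
    return "Arc-Intelligence/ATLAS-8B-Instruct"

print(select_teacher("math"))       # Arc-Intelligence/ATLAS-8B-Thinking
print(select_teacher("debugging"))  # Arc-Intelligence/ATLAS-8B-Instruct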

Compatible Student Models

ATLAS teachers can enhance any instruction-following LLM (a teacher and student loading sketch follows the requirements list below).
Tested Student Models:
  • Qwen/Qwen3-4B-Instruct (4B)
  • meta-llama/Llama-3.1-8B-Instruct (8B)
  • mistralai/Mixtral-8x7B-Instruct-v0.1 (47B)
  • OpenAI GPT-4 (API)
  • Anthropic Claude (API)
Requirements:
  • Instruction-following capability
  • Context window ≥4K tokens
  • Support for system prompts (preferred)
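As a rough illustration, a local open-weights student can be loaded alongside a teacher as shown below. The sketch covers only the loading step; how teacher guidance is passed to the student is defined by the ATLAS inference pipeline and is not reproduced here.

# Sketch: load a teacher and a small open-weights student side by side
from transformers import AutoModelForCausalLM, AutoTokenizer

teacher_id = "Arc-Intelligence/ATLAS-8B-Instruct"
student_id = "Qwen/Qwen3-4B-Instruct"  # any instruction-following model with a ≥4K-token context works

teacher = AutoModelForCausalLM.from_pretrained(teacher_id, torch_dtype="auto", device_map="auto")
student = AutoModelForCausalLM.from_pretrained(student_id, torch_dtype="auto", device_map="auto")
teacher_tokenizer = AutoTokenizer.from_pretrained(teacher_id)
student_tokenizer = AutoTokenizer.from_pretrained(student_id)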

Memory Requirements

Estimated VRAM usage for inference:
| Configuration | VRAM Required | Recommended Hardware |
|---|---|---|
| Teacher only (FP16) | 16 GB | RTX 4080, A5000 |
| Teacher + Small Student | 24 GB | RTX 4090, A6000 |
| Teacher + Large Student | 40 GB+ | A100, H100 |
| Quantized (INT8) | 8 GB | RTX 3080, A4000 |
| Quantized (INT4) | 4 GB | RTX 3070, T4 |
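To reach the INT8 and INT4 rows above, the teacher can be loaded with bitsandbytes quantization through transformers. The configuration below is one common setup rather than an ATLAS-specific recipe, and it requires the bitsandbytes package.

# Sketch: 4-bit quantized loading via bitsandbytes
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # use load_in_8bit=True for the INT8 row instead
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute dtype for dequantized matmuls
    bnb_4bit_quant_type="nf4",
)

model = AutoModelForCausalLM.from_pretrained(
    "Arc-Intelligence/ATLAS-8B-Thinking",
    quantization_config=bnb_config,
    device_map="auto",
)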

Model Versioning

All models follow semantic versioning:
  • Latest stable: No suffix (recommended for production)
  • Experimental: -experimental suffix
  • Specific versions: -v1.0, -v1.1, etc.
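Following the suffix convention above, a production deployment can pin a specific version by loading the suffixed model name. The exact repository name below is constructed from that convention for illustration and should be verified on Hugging Face.

# Sketch: pin a specific model version by its suffixed name
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Arc-Intelligence/ATLAS-8B-Thinking-v1.0",  # constructed from the -v1.0 suffix convention; confirm it exists
    torch_dtype="auto",
    device_map="auto",
)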
Check model cards on Hugging Face for:
  • Training data details
  • Performance benchmarks
  • Known limitations
  • Update changelog

Custom Model Training

Train your own teacher models:
# Using pre-trained as base
scripts/launch.sh 8 configs/run/teacher_sft.yaml \
  model_name_or_path=Arc-Intelligence/ATLAS-8B-Thinking \
  output_dir=my_custom_teacher

# From scratch with your data
scripts/launch.sh 8 configs/run/teacher_sft.yaml \
  model_name_or_path=Qwen/Qwen2.5-7B-Instruct \
  dataset_name=your_dataset
See GRPO Training Guide for detailed instructions.

License and Usage

All ATLAS models are released under the Apache 2.0 license for both research and commercial use.
Responsible Use:
  • Verify outputs for critical applications
  • Monitor for potential biases
  • Respect base model licenses
  • Cite ATLAS in publications

Next Steps