Available Models
ATLAS provides pre-trained teacher models optimized for different tasks. All models have 8B parameters and are trained with the GRPO algorithm using adaptive teaching objectives.
Teacher Models
ATLAS-8B-Thinking
Optimized for mathematical and logical reasoning tasks
- Mathematical problem solving
- Logical reasoning
- Abstract thinking tasks
- Scientific analysis
- Base model: Qwen2.5-7B-Instruct
- Training method: SFT → GRPO
- Specialization: Reasoning-heavy tasks
ATLAS-8B-Instruct
Optimized for code generation and technical instruction
- Code generation
- Technical documentation
- System administration
- API integration
- Base model: Qwen2.5-7B-Instruct
- Training method: SFT → GRPO
- Specialization: Instruction-following and coding
Model Selection Guide
Choose the appropriate teacher model based on your task:

Task Type | Recommended Model | Reasoning |
---|---|---|
Math problems | ATLAS-8B-Thinking | Specialized in step-by-step reasoning |
Debugging | ATLAS-8B-Instruct | Better at code understanding |
Data analysis | ATLAS-8B-Thinking | Strong analytical capabilities |
API development | ATLAS-8B-Instruct | Trained on technical documentation |
Logic puzzles | ATLAS-8B-Thinking | Abstract reasoning focus |
DevOps tasks | ATLAS-8B-Instruct | System administration expertise |
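
The table above can be expressed as a simple lookup. This is a minimal sketch, not part of ATLAS itself: the dictionary and the `recommend_teacher` helper are hypothetical names introduced here, and the entries are copied directly from the table.

```python
# Mapping copied from the Model Selection Guide table.
# RECOMMENDED_TEACHER and recommend_teacher are illustrative names,
# not part of any ATLAS API.
RECOMMENDED_TEACHER = {
    "math problems": "ATLAS-8B-Thinking",
    "debugging": "ATLAS-8B-Instruct",
    "data analysis": "ATLAS-8B-Thinking",
    "api development": "ATLAS-8B-Instruct",
    "logic puzzles": "ATLAS-8B-Thinking",
    "devops tasks": "ATLAS-8B-Instruct",
}


def recommend_teacher(task_type: str) -> str:
    """Return the teacher model the table recommends for a task type."""
    key = task_type.strip().lower()
    if key not in RECOMMENDED_TEACHER:
        raise KeyError(f"No recommendation listed for task type: {task_type!r}")
    return RECOMMENDED_TEACHER[key]


print(recommend_teacher("Debugging"))      # ATLAS-8B-Instruct
print(recommend_teacher("Logic puzzles"))  # ATLAS-8B-Thinking
```

Task types that the table does not list raise a `KeyError` rather than guessing, since the guide gives no default.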
Compatible Student Models
ATLAS teachers can enhance any instruction-following LLM.
Tested Student Models:
- Qwen/Qwen3-4B-Instruct (4B)
- meta-llama/Llama-3.2-8B-Instruct (8B)
- mistralai/Mixtral-8x7B-Instruct-v0.1 (47B)
- OpenAI GPT-4 (API)
- Anthropic Claude (API)

Requirements:
- Instruction-following capability
- Context window ≥4K tokens
- Support for system prompts (preferred)
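
The three student-model criteria listed above (instruction following, a context window of at least 4K tokens, and preferred system-prompt support) can be checked mechanically. The sketch below is illustrative only: `StudentProfile` and `check_student` are hypothetical names, the example profile values are made up, and "4K" is assumed to mean 4096 tokens.

```python
from dataclasses import dataclass

# Assumption: "≥4K tokens" is read as 4096 tokens.
MIN_CONTEXT_TOKENS = 4096


@dataclass
class StudentProfile:
    """Hypothetical description of a candidate student model."""
    name: str
    instruction_following: bool
    context_window: int
    supports_system_prompt: bool


def check_student(profile: StudentProfile) -> tuple[bool, list[str]]:
    """Return (compatible, notes) for the listed requirements."""
    notes = []
    if not profile.instruction_following:
        notes.append("missing instruction-following capability")
    if profile.context_window < MIN_CONTEXT_TOKENS:
        notes.append(
            f"context window {profile.context_window} < {MIN_CONTEXT_TOKENS} tokens"
        )
    # Only the two checks above are hard requirements.
    compatible = not notes
    if not profile.supports_system_prompt:
        notes.append("no system prompt support (preferred, not required)")
    return compatible, notes


# Made-up example profile, not a measured property of any real model.
candidate = StudentProfile("example/student-model", True, 8192, False)
print(check_student(candidate))
```

System-prompt support is reported as a note instead of a failure because the list marks it as preferred, not mandatory.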
Memory Requirements
Estimated VRAM usage for inference:

Configuration | VRAM Required | Recommended Hardware |
---|---|---|
Teacher only (FP16) | 16GB | RTX 4080, A5000 |
Teacher + Small Student | 24GB | RTX 4090, A6000 |
Teacher + Large Student | 40GB+ | A100, H100 |
Quantized (INT8) | 8GB | RTX 3080, A4000 |
Quantized (INT4) | 4GB | RTX 3070, T4 |
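
The teacher-only and quantized rows are consistent with a weights-only estimate: parameter count times bytes per parameter. The sketch below reproduces that arithmetic for an 8B-parameter model. It assumes 1 GB = 10^9 bytes and ignores activations, the KV cache, and framework overhead, so real usage will be somewhat higher; `weight_memory_gb` is a hypothetical helper name.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Weights-only memory estimate in GB (assuming 1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


for precision in ("fp16", "int8", "int4"):
    print(f"{precision}: {weight_memory_gb(8e9, precision):.0f} GB")
# fp16: 16 GB
# int8: 8 GB
# int4: 4 GB
```

The same function gives a rough starting point for the combined teacher and student rows: add the student's estimate to the teacher's, then leave headroom for runtime overhead.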
Model Versioning
All models follow semantic versioning:
- Latest stable: No suffix (recommended for production)
- Experimental: `-experimental` suffix
- Specific versions: `-v1.0`, `-v1.1`, etc.

- Training data details
- Performance benchmarks
- Known limitations
- Update changelog
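
The suffix convention in this section (no suffix for latest stable, `-experimental`, and `-v1.0`-style version tags) can be recognized with a small parser. This is a sketch under the assumption that the suffix is appended directly to the model name; `classify_release` is a hypothetical helper, and the suffixed names in the example are constructed for illustration, not confirmed release names.

```python
import re

# Matches a trailing version tag such as "-v1.0" or "-v1.1".
_VERSION_SUFFIX = re.compile(r"-v(\d+)\.(\d+)$")


def classify_release(model_name: str) -> str:
    """Classify a model name by its versioning suffix."""
    if model_name.endswith("-experimental"):
        return "experimental"
    match = _VERSION_SUFFIX.search(model_name)
    if match:
        return f"version {match.group(1)}.{match.group(2)}"
    # No recognized suffix: treated as the latest stable release.
    return "latest stable"


# Suffixed names below are illustrative constructions.
print(classify_release("ATLAS-8B-Thinking"))               # latest stable
print(classify_release("ATLAS-8B-Thinking-experimental"))  # experimental
print(classify_release("ATLAS-8B-Instruct-v1.1"))          # version 1.1
```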
Custom Model Training
Train your own teacher models:
License and Usage
All ATLAS models are released under the Apache 2.0 license for both research and commercial use.
Responsible Use:
- Verify outputs for critical applications
- Monitor for potential biases
- Respect base model licenses
- Cite ATLAS in publications