SDK Only: 30 seconds • Full Training Stack: 10-15 minutes

Choose Your Path

Most users only need the SDK:
python -m pip install --upgrade arc-atlas
This gives you adaptive dual-agent orchestration, telemetry streaming, and data export. Skip to the Verification section after installation.
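To confirm the install, run a quick version check (a minimal sketch; it assumes only that the package was installed under the arc-atlas distribution name shown above):
python - <<'PY'
from importlib.metadata import version

# Raises PackageNotFoundError if the install did not complete.
print("arc-atlas:", version("arc-atlas"))
PY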
Only install the full training stack if you need to:
  • Train custom teacher models with GRPO
  • Run offline reinforcement learning
  • Fine-tune models on your own hardware
The training stack requires CUDA-capable GPUs, PyTorch 2.6.0, and vLLM 0.8.3. Most teams use pre-trained teacher models and never need this setup.

System Requirements

Minimum Requirements

  • 2× NVIDIA GPUs with CUDA support (for RL training)
  • 1× GPU minimum for inference only
  • 32GB+ system RAM
  • 100GB+ disk space
  • Python 3.10 or newer

Recommended Setup

  • 4× or 8× H100 GPUs (40GB+ VRAM each)
  • 128GB+ system RAM
  • 200GB+ NVMe storage
  • Ubuntu 22.04 LTS

Prerequisites

Before installing: run this 30-second check to verify that your system meets the requirements above.
python - <<'EOF'
import shutil
import subprocess
import sys

checks = []

# Check Python version
py_version = sys.version_info
checks.append(("Python 3.11 or 3.12", (3, 11) <= (py_version.major, py_version.minor) <= (3, 12), f"Found {py_version.major}.{py_version.minor}"))

# Check CUDA
try:
    result = subprocess.run(['nvidia-smi'], capture_output=True, text=True)
    cuda_available = result.returncode == 0
    checks.append(("NVIDIA GPU", cuda_available, "Found" if cuda_available else "Not found"))
except FileNotFoundError:
    checks.append(("NVIDIA GPU", False, "nvidia-smi not available"))

# Check disk space
stat = shutil.disk_usage("/")
free_gb = stat.free / (1024**3)
checks.append(("200GB+ free disk", free_gb >= 200, f"{free_gb:.1f}GB free"))

# Print results
print("\nPrerequisites Check:")
print("-" * 50)
for name, passed, detail in checks:
    status = "✅" if passed else "❌"
    print(f"{status} {name}: {detail}")

all_passed = all(c[1] for c in checks)
print("-" * 50)
if all_passed:
    print("✅ All checks passed! Proceed with installation.")
else:
    print("❌ Some checks failed. Review requirements before installing.")
    sys.exit(1)
EOF
Expected output:
Prerequisites Check:
--------------------------------------------------
✅ Python 3.11 or 3.12: Found 3.11
✅ NVIDIA GPU: Found
✅ 200GB+ free disk: 245.3GB free
--------------------------------------------------
✅ All checks passed! Proceed with installation.
SDK-only users can skip this check; it is only needed for the full training stack (Atlas Core).

1. Set up CUDA

Ensure NVIDIA drivers and CUDA are installed and compatible with PyTorch 2.6.0:
nvidia-smi  # Verify CUDA version
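If PyTorch is already installed, you can also cross-check the toolkit it was built against (a minimal sketch; it assumes only that torch is importable):
python - <<'PY'
import torch

print("PyTorch:", torch.__version__)
print("Built for CUDA:", torch.version.cuda)  # Toolkit version PyTorch was compiled against
print("CUDA available:", torch.cuda.is_available())
PY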

2. Python Environment

Verify your Python version (3.10 or newer is required; the prerequisites check above expects 3.11 or 3.12 for the training stack):
python --version

3. Authenticate with HuggingFace

Authenticate for model and dataset access:
huggingface-cli login
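If you prefer to authenticate from Python (for example in a notebook), huggingface_hub exposes the same flow; a minimal sketch:
from huggingface_hub import login

login()  # Prompts for a token; pass token="hf_..." to skip the prompt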

Installation Methods

python -m pip install --upgrade arc-atlas
Keep credentials such as ANTHROPIC_API_KEY in a .env file and load them before orchestrating runs. Atlas defaults to Anthropic as the primary provider.
After the package installs, bootstrap your project with autodiscovery:
atlas env init --task "Summarize the latest AI news"
atlas run --config .atlas/generated_config.yaml --task "Summarize the latest AI news"
The CLI writes .atlas/discover.json, optional factory scaffolds, and metadata snapshots, automatically loading .env and extending PYTHONPATH as it runs. atlas env init now handles storage setup automatically, so there is no need to run atlas init separately. Re-run atlas env init --scaffold-config-full whenever you want a fresh runtime configuration derived from the discovery output.

Configure Environment

API Keys

# Training stack
export HF_TOKEN="your-huggingface-token"
export WANDB_API_KEY="your-wandb-key"  # Optional

# Runtime SDK
export ANTHROPIC_API_KEY="sk-ant-your-key"  # Primary provider
export GEMINI_API_KEY="your-gemini-key"  # Optional for rewards
Store secrets in .env. The Atlas CLI loads .env automatically and extends PYTHONPATH with your project root and src/ directory.
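If you call the SDK from your own scripts rather than through the Atlas CLI, load .env yourself first. A minimal sketch using python-dotenv, which is an assumption here rather than a documented Atlas dependency:
import os

from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()
assert os.getenv("ANTHROPIC_API_KEY"), "ANTHROPIC_API_KEY missing from .env"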

Disable Tracking

To disable Weights & Biases tracking:
# In command line
python train.py report_to=null

# Or in config file
report_to: null

Verification

After installation, verify your setup:

3-Minute Smoke Test

Run these checks once to confirm that CUDA, model downloads, and the core dependencies (including vLLM) are working before you invest in longer training jobs.
python - <<'PY'
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load teacher model
teacher = AutoModelForCausalLM.from_pretrained(
    "Arc-Intelligence/ATLAS-8B-Thinking",
    device_map="auto",
    torch_dtype=torch.float16
)
teacher_tokenizer = AutoTokenizer.from_pretrained(
    "Arc-Intelligence/ATLAS-8B-Thinking"
)

print("CUDA available:", torch.cuda.is_available())
print("GPU count:", torch.cuda.device_count())
print("Teacher model loaded:", teacher.config.model_type)
print("Model device:", next(teacher.parameters()).device)
PY
Expected output:
CUDA available: True
GPU count: 8
Teacher model loaded: qwen2
Model device: cuda:0
Then verify that the core training dependencies import cleanly:
# Verify core dependencies
import torch
import transformers
import datasets
import vllm

print(f"PyTorch: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"GPU count: {torch.cuda.device_count()}")
print(f"Transformers: {transformers.__version__}")
print(f"vLLM: {vllm.__version__}")
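Expected output on a correctly pinned setup (exact versions and GPU count depend on your environment):
PyTorch: 2.6.0
CUDA available: True
GPU count: 8
Transformers: <your installed version>
vLLM: 0.8.3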

GPU Memory Management

Launch options depend on your GPU configuration. A single GPU supports inference only; for RL training with limited VRAM, use model offloading:
# Inference only with single GPU
python examples/quickstart/evaluate.py  # Quick evaluation test

# For training with limited VRAM (requires 2+ GPUs)
python train.py +offload

# Or use Zero-1 optimization
python train.py +zero1
For distributed training across multiple GPUs:
# Minimum 2 GPUs for RL training (1 for vLLM, 1 for training)
scripts/launch_with_server.sh 1 1 configs/run/teacher_rcl.yaml

# Production setup with 4 GPUs (2 for vLLM, 2 for training)
scripts/launch_with_server.sh 2 2 configs/run/teacher_rcl.yaml

# Full 8 GPU setup
scripts/launch_with_server.sh 4 4 configs/run/teacher_rcl.yaml
Reduce memory usage with these settings:
# In config file
per_device_train_batch_size: 1
gradient_checkpointing: true
fp16: true  # or bf16 for A100/H100

Security Best Practices

Follow these security guidelines to protect sensitive information:
  • Never commit secrets: Keep tokens, .env files, and API keys out of version control
  • Use environment variables: Store HF_TOKEN, WANDB_API_KEY, etc. as environment variables
  • Gitignore protection: Ensure results/, logs/, and wandb/ remain in .gitignore (see the audit sketch after this list)
  • Least privilege: Restrict dataset access permissions
  • Logout on shared machines: Run huggingface-cli logout after use
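A quick way to audit the gitignore guideline above (a minimal sketch; it assumes a .gitignore at the repository root with exact, line-for-line entries):
python - <<'PY'
from pathlib import Path

required = {"results/", "logs/", "wandb/"}
gitignore = Path(".gitignore")
entries = set(gitignore.read_text().splitlines()) if gitignore.exists() else set()
missing = required - entries
print("Missing from .gitignore:", ", ".join(sorted(missing)) or "none")
PY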

Platform-Specific Notes

Tested on Ubuntu 20.04/22.04 LTS:
  • Ensure CUDA toolkit matches PyTorch requirements
  • May need sudo for system package installations

Troubleshooting

If you see CUDA errors:
# Check CUDA version
nvidia-smi
nvcc --version

# Reinstall PyTorch with correct CUDA version
pip install torch==2.6.0 --index-url https://download.pytorch.org/whl/cu118  # For CUDA 11.8
Reduce memory usage:
# Use gradient checkpointing
python train.py gradient_checkpointing=true

# Reduce batch size
python train.py per_device_train_batch_size=1

# Enable CPU offloading
python train.py +offload
Ensure proper authentication:
# Re-authenticate
huggingface-cli logout
huggingface-cli login

# Verify token
huggingface-cli whoami
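The same check is available from Python via huggingface_hub (a minimal sketch):
python - <<'PY'
from huggingface_hub import whoami

print(whoami()["name"])  # Raises if the stored token is missing or invalid
PY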
Common vLLM issues:
# Install build dependencies
sudo apt-get install python3-dev

# Try pre-built wheel
pip install https://github.com/vllm-project/vllm/releases/download/v0.8.3/vllm-0.8.3-cp311-cp311-linux_x86_64.whl

Next Steps