ADR-0003: Refactor Simulation Workflow to Smol Agents

Status

Accepted

Date

2023-05-01

Context

The current simulation workflow in Arc Memory is implemented using LangGraph, a graph-based framework for orchestrating LLM-powered workflows. While this implementation has served us well for the initial version, we’ve identified several issues:

  1. Complexity: The current implementation is approximately 1,500 lines in a single file (langgraph_flow.py), making it difficult to maintain and extend.

  2. Limited Visibility: Users have little insight into the actual testing that occurs in the sandbox environment, making it hard to understand and trust the simulation results.

  3. Debugging Challenges: The graph-based nature of LangGraph makes debugging complex workflows challenging, as state transitions are not always intuitive.

  4. Sandboxing Limitations: The current implementation lacks robust sandboxing, which is important for securely executing code during simulations.

  5. Explanation Quality: The explanations generated by the current system lack detailed reasoning and transparency about data sources and assumptions.

We’ve been exploring alternative approaches and have identified Smol Agents as a promising framework that could address these issues.

Decision

We will refactor the simulation workflow from LangGraph to Smol Agents. This refactoring will involve:

  1. Modularizing the Code: Breaking down the monolithic langgraph_flow.py into smaller, focused modules with clear responsibilities.

  2. Implementing Agent-Based Architecture: Creating specialized agents for different aspects of the simulation (diff analysis, sandbox testing, metrics analysis, explanation generation).

  3. Enhancing Sandboxing: Leveraging Smol Agents’ built-in sandboxing options (local, E2B, Docker) for more secure code execution.

  4. Improving Visibility: Giving users clearer insight into the testing process and its results.

  5. Enhancing Explanations: Structuring prompts to explicitly request detailed reasoning, data sources, and assumptions.
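The agent-based architecture in point 2 can be sketched in plain Python. This is a structural illustration only: the class names, result fields, and stubbed return values are hypothetical, and a real implementation would delegate each `run` step to a Smol Agents agent rather than a stub.

```python
from dataclasses import dataclass, field

# Hypothetical result container; field names are illustrative, not from the codebase.
@dataclass
class SimulationResult:
    diff_summary: str
    metrics: dict
    explanation: str
    assumptions: list = field(default_factory=list)

class DiffAnalysisAgent:
    """Analyzes a code diff and summarizes the changes (stubbed)."""
    def run(self, diff: str) -> str:
        # A real agent would prompt an LLM via Smol Agents; this stub
        # only shows the hand-off shape between specialized agents.
        return f"{len(diff.splitlines())} changed lines analyzed"

class SandboxTestingAgent:
    """Executes tests in a sandbox and collects metrics (stubbed)."""
    def run(self, diff_summary: str) -> dict:
        return {"latency_ms": 42, "error_rate": 0.0}

class ExplanationAgent:
    """Generates a transparent explanation from collected results (stubbed)."""
    def run(self, diff_summary: str, metrics: dict) -> str:
        return f"Based on '{diff_summary}', observed metrics: {metrics}"

def run_simulation(diff: str) -> SimulationResult:
    """Plain-Python orchestration: each step is an ordinary function call,
    so control flow and errors surface in a normal stack trace rather
    than in graph state transitions."""
    summary = DiffAnalysisAgent().run(diff)
    metrics = SandboxTestingAgent().run(summary)
    explanation = ExplanationAgent().run(summary, metrics)
    return SimulationResult(summary, metrics, explanation,
                            assumptions=["sandboxed run mirrors production"])
```

The explicit, sequential hand-off is the point: debugging reduces to stepping through ordinary Python, which addresses issue 3 in the Context section.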

Consequences

Positive

  1. Improved Maintainability: Smaller, focused modules will be easier to maintain, test, and extend.

  2. Better Security: Native integration with E2B and other sandboxing options will improve the security of code execution during simulations.

  3. Enhanced User Experience: Users will have more visibility into the simulation process and results, increasing trust and understanding.

  4. More Flexible Workflow: The agent-based approach will support more adaptable workflows that can handle a wider range of scenarios.

  5. Simplified Debugging: Python-native control flow and error handling will make debugging easier compared to graph-based state transitions.

  6. Reduced Code Size: The modular approach is expected to result in less code overall, with better separation of concerns.

Negative

  1. Migration Effort: Significant effort will be required to refactor the existing code, potentially delaying other feature development.

  2. Learning Curve: Team members will need to learn the Smol Agents framework, although it is generally considered simpler than LangGraph.

  3. Potential Regressions: The refactoring could introduce regressions or new bugs that will need to be identified and fixed.

  4. Documentation Updates: Existing documentation and examples will need to be updated to reflect the new architecture.

Alternatives Considered

1. Refactor the Existing LangGraph Implementation

We considered refactoring the existing LangGraph implementation to address the identified issues without changing frameworks. This would involve:

  • Breaking down the monolithic file into smaller modules
  • Improving the graph structure for better debugging
  • Enhancing the prompts for better explanations
  • Adding custom sandboxing

Pros:

  • Less disruptive change
  • Leverages existing knowledge and code

Cons:

  • Would still be constrained by LangGraph’s graph-based architecture
  • Custom sandboxing would require significant effort
  • Debugging would still be challenging

2. Use a Different Agent Framework (e.g., LangChain)

We considered using LangChain’s agent framework instead of Smol Agents.

Pros:

  • More mature and widely used
  • Extensive documentation and examples
  • Rich ecosystem of tools and integrations

Cons:

  • More complex and heavyweight than Smol Agents
  • Less focused on code execution
  • Would still require significant refactoring effort

3. Custom Implementation Without a Framework

We considered implementing our own agent architecture without relying on an external framework.

Pros:

  • Complete control over the implementation
  • No external dependencies
  • Could be tailored exactly to our needs

Cons:

  • Would require significantly more development effort
  • Would need to reinvent solutions to problems already solved by existing frameworks
  • Would require more maintenance over time

Implementation Plan

The implementation will be phased:

Phase 1: Core Module Creation

  1. Create the basic module structure
  2. Implement utility functions and shared code
  3. Extract core functionality from langgraph_flow.py into appropriate modules
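One possible module layout for this phase is sketched below; the directory and file names are illustrative, not final.

```
simulate/
  __init__.py
  diff_analysis.py   # diff parsing and summarization
  sandbox.py         # sandbox execution (local / E2B / Docker)
  metrics.py         # metrics collection and analysis
  explanation.py     # explanation generation
  workflow.py        # agent orchestration (replaces langgraph_flow.py)
  utils.py           # shared helpers
```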

Phase 2: Agent Implementation

  1. Create agent definitions using Smol Agents
  2. Implement the sandbox integration with E2B
  3. Develop the workflow orchestration
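The sandbox integration can hinge on one small seam: a common executor interface with a backend selector. The sketch below is framework-agnostic and hypothetical — Smol Agents' actual executor API differs, and the E2B backend is shown only as a placeholder since it requires the E2B SDK and credentials.

```python
import subprocess
import sys
from abc import ABC, abstractmethod

class SandboxExecutor(ABC):
    """Common interface for the sandbox backends named above.
    Only sketches the seam; the real Smol Agents executors differ."""
    @abstractmethod
    def run(self, code: str) -> str: ...

class LocalExecutor(SandboxExecutor):
    """Runs code in a subprocess. Isolation is weak; suitable only
    for trusted code during development."""
    def run(self, code: str) -> str:
        proc = subprocess.run([sys.executable, "-c", code],
                              capture_output=True, text=True, timeout=30)
        return proc.stdout

class E2BExecutor(SandboxExecutor):
    """Placeholder for a remote E2B sandbox; a real implementation
    would use the E2B SDK and an API key (not shown)."""
    def run(self, code: str) -> str:
        raise NotImplementedError("requires the E2B SDK and credentials")

def make_executor(backend: str) -> SandboxExecutor:
    """Selects a backend the way the refactored workflow might."""
    executors = {"local": LocalExecutor, "e2b": E2BExecutor}
    return executors[backend]()
```

Keeping the selection behind `make_executor` means the rest of the workflow never needs to know which sandbox is in use, so backends can be added or swapped without touching the agents.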

Phase 3: CLI Integration

  1. Update the CLI to use the new workflow
  2. Implement progress reporting and output formatting
  3. Add support for the --memory flag and history subcommand
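The CLI surface for this phase might look like the following argparse sketch. Only the --memory flag and the history subcommand come from this document; the program name, the sim subcommand, and the --diff option are illustrative assumptions.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Sketch of the CLI described above; names beyond --memory and
    history are hypothetical."""
    parser = argparse.ArgumentParser(prog="arc")
    sub = parser.add_subparsers(dest="command", required=True)

    # Hypothetical simulation subcommand carrying the --memory flag.
    sim = sub.add_parser("sim", help="run a simulation for a diff")
    sim.add_argument("--memory", action="store_true",
                     help="persist simulation results to Arc Memory")
    sim.add_argument("--diff", default="HEAD",
                     help="diff or revision to simulate (illustrative)")

    # History subcommand for browsing past runs.
    sub.add_parser("history", help="list past simulation runs")
    return parser
```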

Phase 4: Testing and Documentation

  1. Write comprehensive tests for each module
  2. Update documentation to reflect the new architecture
  3. Create examples demonstrating the new workflow
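With the modular structure, each seam can be unit-tested in isolation. A minimal pytest-style example, where the function under test is a stand-in for the future diff-analysis module (names hypothetical):

```python
# test_workflow.py — illustrative unit test; the real modules and
# fixtures will depend on the final structure.

def summarize_diff(diff: str) -> str:
    """Stand-in for the diff-analysis module under test."""
    return f"{len(diff.splitlines())} changed lines"

def test_summarize_diff_counts_lines():
    # pytest discovers test_* functions and reports bare assert failures.
    assert summarize_diff("a\nb\nc\n") == "3 changed lines"
```

Because each agent is an ordinary class with a plain-Python entry point, stubs like this replace LLM calls in tests, keeping the suite fast and deterministic.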

References