ADR-0003: Refactor Simulation Workflow to Smol Agents

Status

Accepted

Date

2023-05-01

Context

The current simulation workflow in Arc Memory is implemented using LangGraph, a graph-based framework for orchestrating LLM-powered workflows. While this implementation has served us well for the initial version, we’ve identified several issues:

  1. Complexity: The current implementation is approximately 1,500 lines in a single file (langgraph_flow.py), making it difficult to maintain and extend.

  2. Limited Visibility: Users have little insight into the actual testing that occurs in the sandbox environment, making it hard to understand and trust the simulation results.

  3. Debugging Challenges: The graph-based nature of LangGraph makes debugging complex workflows challenging, as state transitions are not always intuitive.

  4. Sandboxing Limitations: The current implementation lacks robust sandboxing, which is important for securely executing code during simulations.

  5. Explanation Quality: The explanations generated by the current system lack detailed reasoning and transparency about data sources and assumptions.

We’ve been exploring alternative approaches and have identified Smol Agents as a promising framework that could address these issues.

Decision

We will refactor the simulation workflow from LangGraph to Smol Agents. This refactoring will involve:

  1. Modularizing the Code: Breaking down the monolithic langgraph_flow.py into smaller, focused modules with clear responsibilities.

  2. Implementing Agent-Based Architecture: Creating specialized agents for different aspects of the simulation (diff analysis, sandbox testing, metrics analysis, explanation generation).

  3. Enhancing Sandboxing: Leveraging Smol Agents’ built-in sandboxing options (local, E2B, Docker) for more secure code execution.

  4. Improving Visibility: Giving users clearer insight into the testing process and its results.

  5. Enhancing Explanations: Structuring prompts to explicitly request detailed reasoning, data sources, and assumptions.
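The agent-based architecture in point 2 can be sketched in plain Python. This is a structural illustration only: the class names, result fields, and stubbed return values are hypothetical, and a real implementation would delegate each `run` step to a Smol Agents agent rather than a stub.

```python
from dataclasses import dataclass, field

# Hypothetical result container; field names are illustrative, not from the codebase.
@dataclass
class SimulationResult:
    diff_summary: str
    metrics: dict
    explanation: str
    assumptions: list = field(default_factory=list)

class DiffAnalysisAgent:
    """Analyzes a code diff and summarizes the changes (stubbed)."""
    def run(self, diff: str) -> str:
        # A real agent would prompt an LLM via Smol Agents; this stub
        # only shows the hand-off shape between specialized agents.
        return f"{len(diff.splitlines())} changed lines analyzed"

class SandboxTestingAgent:
    """Executes tests in a sandbox and collects metrics (stubbed)."""
    def run(self, diff_summary: str) -> dict:
        return {"latency_ms": 42, "error_rate": 0.0}

class ExplanationAgent:
    """Generates a transparent explanation from collected results (stubbed)."""
    def run(self, diff_summary: str, metrics: dict) -> str:
        return f"Based on '{diff_summary}', observed metrics: {metrics}"

def run_simulation(diff: str) -> SimulationResult:
    """Plain-Python orchestration: each step is an ordinary function call,
    so control flow and errors surface in a normal stack trace rather
    than in graph state transitions."""
    summary = DiffAnalysisAgent().run(diff)
    metrics = SandboxTestingAgent().run(summary)
    explanation = ExplanationAgent().run(summary, metrics)
    return SimulationResult(summary, metrics, explanation,
                            assumptions=["sandboxed run mirrors production"])
```

The explicit, sequential hand-off is the point: debugging reduces to stepping through ordinary Python, which addresses issue 3 in the Context section.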

Consequences

Positive

  1. Improved Maintainability: Smaller, focused modules will be easier to maintain, test, and extend.

  2. Better Security: Native integration with E2B and other sandboxing options will improve the security of code execution during simulations.

  3. Enhanced User Experience: Users will have more visibility into the simulation process and results, increasing trust and understanding.

  4. More Flexible Workflow: The agent-based approach will support more adaptable workflows that can handle a wider range of scenarios.

  5. Simplified Debugging: Python-native control flow and error handling will make debugging easier compared to graph-based state transitions.

  6. Reduced Code Size: The modular approach is expected to result in less code overall, with better separation of concerns.

Negative

  1. Migration Effort: Significant effort will be required to refactor the existing code, potentially delaying other feature development.

  2. Learning Curve: Team members will need to learn the Smol Agents framework, although it is generally considered simpler than LangGraph.

  3. Potential Regressions: The refactoring could introduce regressions or new bugs that will need to be identified and fixed.

  4. Documentation Updates: Existing documentation and examples will need to be updated to reflect the new architecture.

Alternatives Considered

1. Refactor the Existing LangGraph Implementation

We considered refactoring the existing LangGraph implementation to address the identified issues without changing frameworks. This would involve:

  • Breaking down the monolithic file into smaller modules
  • Improving the graph structure for better debugging
  • Enhancing the prompts for better explanations
  • Adding custom sandboxing

Pros:

  • Less disruptive change
  • Leverages existing knowledge and code

Cons:

  • Would still be constrained by LangGraph’s graph-based architecture
  • Custom sandboxing would require significant effort
  • Debugging would still be challenging

2. Use a Different Agent Framework (e.g., LangChain)

We considered using LangChain’s agent framework instead of Smol Agents.

Pros:

  • More mature and widely used
  • Extensive documentation and examples
  • Rich ecosystem of tools and integrations

Cons:

  • More complex and heavyweight than Smol Agents
  • Less focused on code execution
  • Would still require significant refactoring effort

3. Custom Implementation Without a Framework

We considered implementing our own agent architecture without relying on an external framework.

Pros:

  • Complete control over the implementation
  • No external dependencies
  • Could be tailored exactly to our needs

Cons:

  • Would require significantly more development effort
  • Would need to reinvent solutions to problems already solved by existing frameworks
  • Would require more maintenance over time

Implementation Plan

The implementation will be phased:

Phase 1: Core Module Creation

  1. Create the basic module structure
  2. Implement utility functions and shared code
  3. Extract core functionality from langgraph_flow.py into appropriate modules
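One possible module layout for this phase is sketched below; the directory and file names are illustrative, not final.

```
simulate/
  __init__.py
  diff_analysis.py   # diff parsing and summarization
  sandbox.py         # sandbox execution (local / E2B / Docker)
  metrics.py         # metrics collection and analysis
  explanation.py     # explanation generation
  workflow.py        # agent orchestration (replaces langgraph_flow.py)
  utils.py           # shared helpers
```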

Phase 2: Agent Implementation

  1. Create agent definitions using Smol Agents
  2. Implement the sandbox integration with E2B
  3. Develop the workflow orchestration
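The sandbox integration can hinge on one small seam: a common executor interface with a backend selector. The sketch below is framework-agnostic and hypothetical — Smol Agents' actual executor API differs, and the E2B backend is shown only as a placeholder since it requires the E2B SDK and credentials.

```python
import subprocess
import sys
from abc import ABC, abstractmethod

class SandboxExecutor(ABC):
    """Common interface for the sandbox backends named above.
    Only sketches the seam; the real Smol Agents executors differ."""
    @abstractmethod
    def run(self, code: str) -> str: ...

class LocalExecutor(SandboxExecutor):
    """Runs code in a subprocess. Isolation is weak; suitable only
    for trusted code during development."""
    def run(self, code: str) -> str:
        proc = subprocess.run([sys.executable, "-c", code],
                              capture_output=True, text=True, timeout=30)
        return proc.stdout

class E2BExecutor(SandboxExecutor):
    """Placeholder for a remote E2B sandbox; a real implementation
    would use the E2B SDK and an API key (not shown)."""
    def run(self, code: str) -> str:
        raise NotImplementedError("requires the E2B SDK and credentials")

def make_executor(backend: str) -> SandboxExecutor:
    """Selects a backend the way the refactored workflow might."""
    executors = {"local": LocalExecutor, "e2b": E2BExecutor}
    return executors[backend]()
```

Keeping the selection behind `make_executor` means the rest of the workflow never needs to know which sandbox is in use, so backends can be added or swapped without touching the agents.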

Phase 3: CLI Integration

  1. Update the CLI to use the new workflow
  2. Implement progress reporting and output formatting
  3. Add support for the --memory flag and history subcommand
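The CLI surface for this phase might look like the following argparse sketch. Only the --memory flag and the history subcommand come from this document; the program name, the sim subcommand, and the --diff option are illustrative assumptions.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Sketch of the CLI described above; names beyond --memory and
    history are hypothetical."""
    parser = argparse.ArgumentParser(prog="arc")
    sub = parser.add_subparsers(dest="command", required=True)

    # Hypothetical simulation subcommand carrying the --memory flag.
    sim = sub.add_parser("sim", help="run a simulation for a diff")
    sim.add_argument("--memory", action="store_true",
                     help="persist simulation results to Arc Memory")
    sim.add_argument("--diff", default="HEAD",
                     help="diff or revision to simulate (illustrative)")

    # History subcommand for browsing past runs.
    sub.add_parser("history", help="list past simulation runs")
    return parser
```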

Phase 4: Testing and Documentation

  1. Write comprehensive tests for each module
  2. Update documentation to reflect the new architecture
  3. Create examples demonstrating the new workflow
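With the modular structure, each seam can be unit-tested in isolation. A minimal pytest-style example, where the function under test is a stand-in for the future diff-analysis module (names hypothetical):

```python
# test_workflow.py — illustrative unit test; the real modules and
# fixtures will depend on the final structure.

def summarize_diff(diff: str) -> str:
    """Stand-in for the diff-analysis module under test."""
    return f"{len(diff.splitlines())} changed lines"

def test_summarize_diff_counts_lines():
    # pytest discovers test_* functions and reports bare assert failures.
    assert summarize_diff("a\nb\nc\n") == "3 changed lines"
```

Because each agent is an ordinary class with a plain-Python entry point, stubs like this replace LLM calls in tests, keeping the suite fast and deterministic.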

References