# ADR-0003: Refactor Simulation Workflow to Smol Agents

Architecture decision record for refactoring the simulation workflow from LangGraph to Smol Agents.
## Status

Accepted

## Date

2023-05-01

## Context
The current simulation workflow in Arc Memory is implemented using LangGraph, a graph-based framework for orchestrating LLM-powered workflows. While this implementation has served us well for the initial version, we’ve identified several issues:
- **Complexity**: The current implementation is approximately 1,500 lines in a single file (`langgraph_flow.py`), making it difficult to maintain and extend.
- **Limited Visibility**: Users have limited visibility into the actual testing that occurs in the sandbox environment, making it difficult to understand and trust the simulation results.
- **Debugging Challenges**: The graph-based nature of LangGraph makes debugging complex workflows challenging, as state transitions are not always intuitive.
- **Sandboxing Limitations**: The current implementation lacks robust sandboxing, which is important for securely executing code during simulations.
- **Explanation Quality**: The explanations generated by the current system lack detailed reasoning and transparency about data sources and assumptions.
We’ve been exploring alternative approaches and have identified Smol Agents as a promising framework that could address these issues.
## Decision
We will refactor the simulation workflow from LangGraph to Smol Agents. This refactoring will involve:
- **Modularizing the Code**: Breaking down the monolithic `langgraph_flow.py` into smaller, focused modules with clear responsibilities.
- **Implementing an Agent-Based Architecture**: Creating specialized agents for different aspects of the simulation (diff analysis, sandbox testing, metrics analysis, explanation generation).
- **Enhancing Sandboxing**: Leveraging Smol Agents' built-in sandboxing options (local, E2B, Docker) for more secure code execution.
- **Improving Visibility**: Providing users with more visibility into the testing process and results.
- **Enhancing Explanations**: Structuring prompts to explicitly request detailed reasoning, data sources, and assumptions.
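The agent decomposition above can be sketched in plain Python. This is a framework-agnostic illustration, not the actual Arc Memory code: the names (`SimulationState`, `DiffAnalysisAgent`, `ExplanationAgent`) are hypothetical, and in the real implementation each `run` step would delegate to a Smol Agents agent rather than the stubbed logic shown here.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass
class SimulationState:
    """Shared state passed between the specialized agents."""
    diff: str
    affected_files: list[str] = field(default_factory=list)
    test_output: str = ""
    metrics: dict[str, float] = field(default_factory=dict)
    explanation: str = ""


class SimulationAgent(Protocol):
    """The single interface every specialized agent implements."""
    def run(self, state: SimulationState) -> SimulationState: ...


class DiffAnalysisAgent:
    """Identifies which files a diff touches (the real agent would use an LLM)."""
    def run(self, state: SimulationState) -> SimulationState:
        state.affected_files = [
            line[len("+++ b/"):]
            for line in state.diff.splitlines()
            if line.startswith("+++ b/")
        ]
        return state


class ExplanationAgent:
    """Summarizes results while citing data sources explicitly."""
    def run(self, state: SimulationState) -> SimulationState:
        state.explanation = (
            f"Analyzed {len(state.affected_files)} file(s); "
            f"sources: diff, sandbox output."
        )
        return state


def run_simulation(diff: str, agents: list[SimulationAgent]) -> SimulationState:
    """Plain-Python orchestration: each agent transforms the shared state in turn."""
    state = SimulationState(diff=diff)
    for agent in agents:
        state = agent.run(state)
    return state
```

Because the orchestration is ordinary Python control flow rather than graph transitions, stepping through it in a debugger is straightforward, which is the point of the refactor.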
## Consequences

### Positive
- **Improved Maintainability**: Smaller, focused modules will be easier to maintain, test, and extend.
- **Better Security**: Native integration with E2B and other sandboxing options will improve the security of code execution during simulations.
- **Enhanced User Experience**: Users will have more visibility into the simulation process and results, increasing trust and understanding.
- **More Flexible Workflow**: The agent-based approach will allow for more flexible and adaptable workflows that can handle a wider range of scenarios.
- **Simplified Debugging**: Python-native control flow and error handling will make debugging easier compared to graph-based state transitions.
- **Reduced Code Size**: The modular approach is expected to result in less code overall, with better separation of concerns.
### Negative
- **Migration Effort**: Significant effort will be required to refactor the existing code, potentially delaying other feature development.
- **Learning Curve**: Team members will need to learn the Smol Agents framework, although it is generally considered simpler than LangGraph.
- **Potential Regressions**: The refactoring could introduce regressions or new bugs that will need to be identified and fixed.
- **Documentation Updates**: Existing documentation and examples will need to be updated to reflect the new architecture.
## Alternatives Considered

### 1. Refactor the Existing LangGraph Implementation
We considered refactoring the existing LangGraph implementation to address the identified issues without changing frameworks. This would involve:
- Breaking down the monolithic file into smaller modules
- Improving the graph structure for better debugging
- Enhancing the prompts for better explanations
- Adding custom sandboxing
**Pros:**
- Less disruptive change
- Leverages existing knowledge and code
**Cons:**
- Would still be constrained by LangGraph’s graph-based architecture
- Custom sandboxing would require significant effort
- Debugging would still be challenging
### 2. Use a Different Agent Framework (e.g., LangChain)
We considered using LangChain’s agent framework instead of Smol Agents.
**Pros:**
- More mature and widely used
- Extensive documentation and examples
- Rich ecosystem of tools and integrations
**Cons:**
- More complex and heavyweight than Smol Agents
- Less focused on code execution
- Would still require significant refactoring effort
### 3. Custom Implementation Without a Framework
We considered implementing our own agent architecture without relying on an external framework.
**Pros:**
- Complete control over the implementation
- No external dependencies
- Could be tailored exactly to our needs
**Cons:**
- Would require significantly more development effort
- Would need to reinvent solutions to problems already solved by existing frameworks
- Would require more maintenance over time
## Implementation Plan
The implementation will be phased:
### Phase 1: Core Module Creation
- Create the basic module structure
- Implement utility functions and shared code
- Extract core functionality from `langgraph_flow.py` into appropriate modules
### Phase 2: Agent Implementation
- Create agent definitions using Smol Agents
- Implement the sandbox integration with E2B
- Develop the workflow orchestration
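One way to structure the sandbox integration is behind a small executor interface so that the local, E2B, and Docker backends stay interchangeable. The sketch below is an assumption about how that could look, not the shipped design: the interface and class names are hypothetical, and the E2B backend is left as a stub because it depends on the E2B SDK and credentials.

```python
import subprocess
import sys
from typing import Protocol


class CodeExecutor(Protocol):
    """Minimal interface the workflow depends on, so backends are swappable."""
    def run(self, code: str, timeout: int = 30) -> str: ...


class LocalExecutor:
    """Runs code in a subprocess -- acceptable for trusted code, but no isolation."""
    def run(self, code: str, timeout: int = 30) -> str:
        result = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True, text=True, timeout=timeout,
        )
        return result.stdout + result.stderr


class E2BExecutor:
    """Placeholder for the E2B-backed sandbox (requires the E2B SDK and an API key)."""
    def run(self, code: str, timeout: int = 30) -> str:
        raise NotImplementedError("wire up the E2B sandbox here")


def run_in_sandbox(code: str, executor: CodeExecutor) -> str:
    """The sandbox-testing agent calls this without knowing which backend is active."""
    return executor.run(code)
```

Keeping the executor pluggable means the security posture (local vs. E2B vs. Docker) becomes a configuration choice rather than a code change.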
### Phase 3: CLI Integration
- Update the CLI to use the new workflow
- Implement progress reporting and output formatting
- Add support for the `--memory` flag and the `history` subcommand
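The Phase 3 CLI surface could be sketched with `argparse`. The `--memory` flag and `history` subcommand come from the plan above; the `arc sim` program name, the `run` subcommand, and the `--diff` option are illustrative assumptions about the eventual interface.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI surface for the refactored simulation command."""
    parser = argparse.ArgumentParser(prog="arc sim")
    # Opt-in persistence of simulation results to Arc Memory.
    parser.add_argument("--memory", action="store_true",
                        help="persist simulation results to Arc Memory")

    sub = parser.add_subparsers(dest="command")

    run = sub.add_parser("run", help="run a simulation against a diff")
    run.add_argument("--diff", default="HEAD",
                     help="rev or diff to simulate (illustrative option)")

    # Surfaces past runs, addressing the history-display issues noted in References.
    sub.add_parser("history", help="show past simulation runs")
    return parser
```
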
### Phase 4: Testing and Documentation
- Write comprehensive tests for each module
- Update documentation to reflect the new architecture
- Create examples demonstrating the new workflow
## References
- Smol Agents Documentation
- E2B Sandboxing
- Simulation Improvements Plan
- Issue #234: Simulation History Display Problems
- Issue #245: Improve Explanation Quality
- PR #256: Initial Prototype of Smol Agents Integration