Using the Arc Memory SDK

The Arc Memory SDK provides a powerful interface for building and querying knowledge graphs from your codebase. It embeds a local, bi-temporal knowledge graph (TKG) in your workspace, surfacing verifiable decision trails during code review and exposing the same provenance to LLM-powered agents.

Installation and Setup

1. Check Python Version

Arc Memory requires Python 3.10 or higher and is compatible with Python 3.10, 3.11, and 3.12.
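
If you are not sure which interpreter will run the SDK, a quick standard-library check confirms you are on a supported version:

import sys

# Abort early if the interpreter is older than the minimum Arc Memory supports.
if sys.version_info < (3, 10):
    raise RuntimeError(f"Python 3.10+ required, found {sys.version.split()[0]}")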

2. Install the SDK

# Using pip
pip install arc-memory

# Or using UV (recommended)
uv pip install arc-memory

3. Authenticate (Optional)

# For GitHub integration
arc auth gh

4. Import and Initialize

from arc_memory import ArcMemory

# Basic initialization
arc = ArcMemory()

# With custom configuration
arc = ArcMemory(
    repo_path="./my-project",
    output_path="./knowledge-graph.db",
    github_token="your_token_here"  # Optional: for GitHub integration
)

Common Use Cases

# Build a complete knowledge graph
arc.build()

# Incremental build (faster for subsequent updates)
arc.build_incremental()

# Build with specific options
arc.build(
    max_commits=1000,
    days=30,
    include_prs=True,
    include_issues=True
)

Best Practices

Performance Optimization

  • Use incremental builds for faster updates
  • Apply specific filters to limit search scope
  • Cache results for frequently accessed data (see the caching sketch after the example below)

# Example: Optimized querying
result = arc.trace_file(
    file_path="src/main.py",
    line_number=42,
    max_depth=3,  # Limit depth for faster results
    filters={
        "author": "username",
        "date_after": "2024-01-01"
    }
)
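
If the same trace is requested repeatedly (for example, by a review bot), memoizing the call avoids re-walking the graph. A minimal caching sketch using functools.lru_cache, reusing the trace_file call shown above (the arc instance is the one created during initialization):

from functools import lru_cache

@lru_cache(maxsize=256)
def cached_trace(file_path, line_number):
    # Arguments must be hashable, so pass only simple values here.
    return arc.trace_file(file_path=file_path, line_number=line_number, max_depth=3)

# Repeated calls for the same file and line are served from the in-memory cache.
history = cached_trace("src/main.py", 42)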

Error Handling

  • Always wrap build and query calls in try/except blocks
  • Validate inputs before running queries
  • Handle rate limits for GitHub operations (see the retry sketch after the example below)

from arc_memory.exceptions import GraphBuildError

try:
    arc.build()
except GraphBuildError as e:
    print(f"Build failed: {e}")
    # Handle error appropriately
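
GitHub-backed builds can fail transiently when the API rate limit is hit. The exact exception Arc Memory raises for rate limiting is not shown here, so this sketch simply retries on GraphBuildError with exponential backoff; swap in a more specific exception if your SDK version provides one:

import time
from arc_memory.exceptions import GraphBuildError

def build_with_retry(arc, attempts=3, base_delay=60):
    # Retry transient failures (such as GitHub rate limits) with exponential backoff.
    for attempt in range(attempts):
        try:
            return arc.build_incremental()
        except GraphBuildError as e:
            if attempt == attempts - 1:
                raise
            delay = base_delay * (2 ** attempt)
            print(f"Build failed ({e}); retrying in {delay}s")
            time.sleep(delay)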

Advanced Usage

Custom Queries

from arc_memory.schema.models import NodeType, EdgeRel

# Custom graph traversal
results = arc.query(
    start_node_type=NodeType.FILE,
    start_node_id="src/main.py",
    edge_types=[EdgeRel.MENTIONS], 
    max_hops=3  
)

# Process custom query results
for node in results.nodes:
    print(f"Node: {node.id}")
    for edge in node.edges:
        print(f"  Connected to: {edge.target_node.id}")

Working with Large Codebases

For large codebases, consider these strategies (see the sketch below):

  • Limit the initial build with max_commits and days so the first pass stays manageable
  • Switch to build_incremental() for routine updates once the full graph exists
  • Constrain queries with max_depth and filters so traversals stay focused
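
A sketch of that workflow, reusing only the calls shown earlier in this guide (the repository path is a placeholder):

from arc_memory import ArcMemory

arc = ArcMemory(repo_path="./my-large-project")

# Initial build: keep the scope small so the first pass finishes quickly.
arc.build(max_commits=1000, days=30, include_prs=True, include_issues=True)

# Routine updates: only ingest what changed since the last build.
arc.build_incremental()

# Keep traversals shallow so queries stay fast.
result = arc.trace_file(
    file_path="src/main.py",
    line_number=42,
    max_depth=2,
)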

Integration Examples

CI/CD Integration

from arc_memory import ArcMemory
import os

def update_knowledge_graph():
    arc = ArcMemory(
        repo_path=os.getenv("WORKSPACE_PATH"),
        github_token=os.getenv("GITHUB_TOKEN")
    )
    
    try:
        # Perform incremental build
        result = arc.build_incremental()
        print(f"Updated graph: {result.node_count} nodes")
        
        # Optional: Run analysis
        critical_files = ["src/core/auth.py", "src/core/api.py"]
        for file in critical_files:
            history = arc.trace_file(file)
            # Process history...
            
    except Exception as e:
        print(f"Error updating graph: {e}")
        raise

Automated Code Review

from arc_memory import ArcMemory
from arc_memory.schema.models import NodeType

def analyze_pr_changes(pr_number):
    arc = ArcMemory()
    
    # Get PR changes
    changes = arc.trace_pr(pr_number)
    
    # Analyze impact
    impact_score = 0
    for node in changes.nodes:
        if node.type == NodeType.FILE:
            # Check file history
            history = arc.trace_file(node.id)
            impact_score += len(history)
    
    return {
        "impact_score": impact_score,
        "changed_files": len(changes.nodes),
        "related_prs": [pr.id for pr in changes.related_prs]
    }
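
Wired into a CI job or a pre-merge hook, the summary returned above can gate or annotate a review. A usage sketch (the PR number and threshold are placeholders):

summary = analyze_pr_changes(1234)
if summary["impact_score"] > 50:
    print("High-impact change: request an additional reviewer")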

Next Steps

For more detailed examples, benchmarking details, and performance metrics, see the Arc Memory GitHub repository.

Arc Memory is designed for high performance, with trace history queries completing in under 200ms (typically ~100μs).