CLAWS — Open-Source AI Agent Scaffolding Catalog

01 The Big Three

Three frameworks dominate the open-source agent scaffolding space as of April 2026. They share the same shape — a CLI/daemon that turns LLM calls into a programmable agent — but optimize for different things.

OpenClaw

"Android for AI agents"

Started as a weekend project by Austrian developer Peter Steinberger in late 2025 (originally Clawdbot). By April 2026 it crossed 345,000 GitHub stars and moved to independent foundation governance after Steinberger joined OpenAI. Optimizes for ecosystem breadth: 50+ integrations across messaging platforms, every major model provider, and the public ClawHub skills registry.

Stars: ~345k

Born: Late 2025

Strength: Breadth

Background article ↗

Hermes Agent

"The agent that grows with you"

Launched February 2026 by Nous Research (the lab behind the Hermes/Nomos/Psyche model families). Optimizes for depth over breadth — uses FTS5 over SQLite for full-text search across every past session, autonomously writes structured "skill" docs after complex tasks, and integrates with Atropos for RL fine-tuning. Conservative security posture: container hardening, read-only root filesystems, Tirith pre-execution scanner.

Stars: ~22k

Born: Feb 2026

Strength: Memory

GitHub ↗
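As a rough illustration of the FTS5-over-SQLite idea described in the Hermes Agent card, the sketch below indexes a few messages and searches them with Python's standard `sqlite3` module. The table name, column names, and sample rows are hypothetical and are not taken from Hermes itself; it also assumes the local SQLite build ships with the FTS5 extension, which most do.

```python
import sqlite3

# Hypothetical schema: one row per message. Names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE messages USING fts5(session_id, role, content)")

rows = [
    ("s1", "user", "Deploy the staging cluster with read-only root filesystems"),
    ("s1", "assistant", "Staging deploy finished and containers are hardened"),
    ("s2", "user", "Summarize the training run from last week"),
]
conn.executemany("INSERT INTO messages VALUES (?, ?, ?)", rows)


def search(query: str, limit: int = 5) -> list:
    """Return (session_id, content) pairs, best match first (FTS5 rank)."""
    cur = conn.execute(
        "SELECT session_id, content FROM messages "
        "WHERE messages MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return cur.fetchall()


hits = search("staging")           # matching is case-insensitive by default
training_hits = search("training")
```

Because the index covers every stored message, a query can surface material from any earlier session, which is the property the card attributes to Hermes's memory design.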

Claw Code

Clean-room Claude Code rewrite

Created by Sigrid Jin after the Claude Code source map leaked via npm in March 2026. Independent Python + Rust reimplementation of the Claude Code agent harness — not a fork, a clean-room rebuild. Hit 48k+ stars and 56k forks within weeks. Optimizes for raw performance on coding workloads via the Rust runtime; Python orchestration on top.

Stars: ~48k

Born: Mar 31 2026

Strength: Speed

GitHub ↗

CONTEXT

"Scaffolding" here means the harness around the LLM — the CLI, the daemon, the system-prompt management (SOUL.md / agent.yaml in OpenClaw's case), the tool router, the model proxy. The model itself is swappable; the scaffolding is what gives the agent its personality, its memory, and its safety boundaries.

02 OpenClaw vs Hermes — Head-to-Head

The two flagship frameworks chose opposite philosophies. This is the side-by-side that actually matters when you're picking one for a multi-month project.

Dimension	OpenClaw	Hermes Agent
Origin	Peter Steinberger (Austria), late 2025 — solo weekend project that exploded	Nous Research, Feb 2026 — research lab, deliberate launch
GitHub Stars (Apr 2026)	~345,000	~22,000
Tagline	"Android for AI agents"	"The agent that grows with you"
Optimizes For	Ecosystem breadth — many channels, many models	Depth + memory — single deep agent, learns over time
Model Providers	Anthropic, OpenAI, Google, Ollama, xAI, more	Anthropic, OpenAI, OpenRouter; Atropos RL integration
Memory System	External persistence + ClawHub skills registry	FTS5 + SQLite over all past sessions, LLM summarization
Skill / Plugin System	ClawHub (public) — discoverable, installable	Autonomous skill creation — agent writes its own after complex tasks
Multi-Profile / Multi-Instance	Per-channel agent bindings (Discord, Slack, etc.)	v0.6.0 (Mar 30, 2026) introduced isolated profiles per install
Security Posture	Review current advisories, registry trust, and sandboxing before production deployment	Container hardening, read-only roots, Tirith pre-exec scanner
RL / Fine-Tuning	Not built-in — bring your own pipeline	Atropos integration — generate parallel tool-call trajectories
Best For	Multi-channel deployments, broad integration matrix, rapid prototyping	Long-lived single agent, research labs, security-sensitive deployments
Migration Path	Official OpenClaw → Hermes migration tool exists for moving configs

03 The Full Catalog

59 frameworks across four categories. Filter by what you actually want to build. Every card links to a verified GitHub repo (or vendor page for proprietary). Star counts and versions are point-in-time — click through for current state.

Scaffolding Frameworks (10): the harness that turns LLM calls into a programmable agent
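To make "harness" concrete, here is a minimal sketch of the pattern these frameworks share: a system prompt, a tool router, and a model that can be swapped without touching the rest. Everything in it is an assumption made for illustration, including the `Harness` class, the `TOOL <name> <arg>` reply convention, and the stub model; no framework in this catalog is claimed to work exactly this way.

```python
from typing import Callable

# A "model" is any callable from prompt text to reply text, so it is swappable.
Model = Callable[[str], str]


class Harness:
    """Toy scaffold: system prompt plus tool router around a swappable model."""

    def __init__(self, model: Model, system_prompt: str):
        self.model = model
        self.system_prompt = system_prompt
        self.tools = {}  # tool name -> callable taking one string argument

    def register_tool(self, name: str, fn: Callable[[str], str]) -> None:
        self.tools[name] = fn

    def run(self, user_input: str) -> str:
        reply = self.model(f"{self.system_prompt}\n\nUser: {user_input}")
        # Hypothetical convention: "TOOL <name> <arg>" means "call this tool".
        if reply.startswith("TOOL "):
            name, _, arg = reply[len("TOOL "):].partition(" ")
            if name not in self.tools:
                return f"unknown tool: {name}"
            return self.tools[name](arg)
        return reply


def stub_model(prompt: str) -> str:
    # Stand-in for a real LLM call; always requests the "echo" tool.
    return "TOOL echo hello"


agent = Harness(stub_model, system_prompt="You are a careful assistant.")
agent.register_tool("echo", lambda arg: arg.upper())
result = agent.run("say hello")
```

Replacing `stub_model` with a different callable changes the model while the prompt, tools, and routing stay the same.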

Open-Source AI Agent

OpenHands: AI Agent for Software Development

Open Source · Model-Agnostic · Sandboxed Environment · Active Community · Python

⭐ 72k 📦 v1.6.0 · 2026-03-30

OpenHands is an open-source, model-agnostic platform for building and deploying AI agents that interact with digital environments, primarily for software development tasks. It automates repetitive coding, debugging, and project management, allowing developers to focus on complex problem-solving. The platform supports various LLMs and operates within secure, sandboxed Docker containers.

Strengths

Automates repetitive coding, debugging, and project management tasks.
Model-agnostic, supporting proprietary, open-source, and local LLMs.
Operates in a secure, sandboxed Docker environment for code execution.
Provides full observability and intervention capabilities during agent runs.
Supports multi-agent collaboration and extensible agent skills library.

Weaknesses

Susceptible to looping and backtracking on ambiguous issues.
Success is highly sensitive to prompt and environment configuration.
Requires human oversight for complex tasks and broad autonomous product work.
Can struggle with Git operations, CI integration, and non-Python languages.
Running with local LLMs can be resource-intensive and less reliable.

Best For

Automating repetitive coding tasks like code generation and refactoring.
Teams managing large-scale projects with numerous dependencies.
Engineering organizations running large-scale code migrations and maintenance.
Academic and industry researchers advancing AI agent research.
Prototyping and quick iteration on small development projects.

Tech & Notes

Written in Python, utilizing asyncio for efficient execution.
Features a client-server architecture with a web UI and backend AI agents.
Uses Docker Runtime for isolated execution; Local and Remote Runtimes also available.
Employs an event-stream architecture for agent-environment interactions.
Supports a plugin system for extending functionality and customizing runtime environments.
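The event-stream idea in the notes above can be pictured as a shared log that both the agent and its runtime publish to and subscribe from. The sketch below is a toy model of that pattern; the `Event` and `EventStream` classes and the fake runtime are hypothetical and do not reproduce OpenHands's actual classes.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Event:
    source: str   # "agent" or "environment"
    kind: str     # "action" or "observation"
    payload: str


@dataclass
class EventStream:
    events: list = field(default_factory=list)
    subscribers: list = field(default_factory=list)

    def subscribe(self, callback: Callable[[Event], None]) -> None:
        self.subscribers.append(callback)

    def publish(self, event: Event) -> None:
        # Record first, then notify, so the log preserves causal order.
        self.events.append(event)
        for callback in list(self.subscribers):
            callback(event)


stream = EventStream()


def fake_runtime(event: Event) -> None:
    # Stand-in for a sandboxed runtime: answers each action with an observation.
    if event.kind == "action":
        stream.publish(Event("environment", "observation", f"ran: {event.payload}"))


stream.subscribe(fake_runtime)
stream.publish(Event("agent", "action", "ls"))
history = [(e.kind, e.payload) for e in stream.events]
```

Keeping every action and observation in one ordered log is also what makes replay and inspection of a run straightforward.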

Install

pip install --upgrade OpenHands
openhands

GitHub ↗

Open Source — AI Pair Programming

Aider: AI Pair Programming in Your Terminal

Open Source · Terminal-Native · Git Integration · Multi-LLM Support · Python

⭐ 44k 📦 v0.86.0 · 2025-08-09 🪪 Apache-2.0

Aider is an open-source, terminal-based AI pair programming tool that integrates directly with a developer's codebase through Git. It functions as an AI coding partner, assisting with writing, editing, and managing code for both new and existing projects. Aider supports a wide range of LLMs and over 100 programming languages, making it versatile across diverse tech stacks.

Strengths

Auto-commits each AI change with descriptive messages, maintaining clean Git history
Directly modifies files within the project, reducing manual copy-pasting errors
Maintains context across multiple files and revisions for complex tasks
Supports a wide range of LLMs (OpenAI, Anthropic, Google, DeepSeek, local via Ollama/LM Studio)
Proficient in refactoring complex and large legacy codebases
Can automatically lint, test, and fix code after making changes

Weaknesses

Steeper learning curve compared to IDE-integrated AI assistants
Variable costs for heavy usage with expensive LLMs
Lacks robust features for team collaboration or shared context management
Does not offer real-time, predictive code suggestions within an IDE

Best For

Developers who prefer a terminal-centric workflow
Rapid prototyping and accelerated software development
Experimenting with different LLMs without vendor lock-in
Teams working on large, complex codebases or migrating legacy code
Automating bash scripts and CI/CD pipelines

Tech & Notes

Primarily Python-based CLI tool
Connects to various LLM APIs for natural language processing and code generation
Maintains context throughout a session and works across multiple files
Experimental support for a split 'Architect' and 'Editor' model approach
Released under the Apache 2.0 license

Install

python -m pip install aider-install
aider-install

python -m pip install -U --upgrade-strategy only-if-needed aider-chat

GitHub ↗

Open-Source AI — Coding Agent

Cline: The Collaborative AI Coder

Open-Source · Client-Side · BYOK · IDE & CLI

⭐ 61k 📦 v3.81.0 · 2026-04-24 🪪 Apache-2.0

Cline is an open-source AI coding agent that operates within IDEs like VS Code and JetBrains, as well as via a CLI. It functions as a collaborative AI coder, assisting with complex software development tasks through natural language interactions. Cline offers code generation, editing, task planning, terminal command execution, and browser interaction capabilities.

Strengths

Handles complex projects and refactoring across large codebases, maintaining consistency.
Structured "Plan & Act" workflow allows user review and refinement before execution.
Deep contextual understanding and management with multi-stage optimization and "Memory Bank."
Flexible model integration supporting various LLM providers (Anthropic, Google, OpenAI, xAI) with "Bring Your Own Key" (BYOK).
Client-side, zero-trust architecture ensures code and prompts remain private and within security perimeters.
Extensible through the Model Context Protocol (MCP) for external tool integration.
Automates GitHub issue analysis and responses within GitHub Actions via CLI.

Weaknesses

Steeper learning curve and longer initial setup time compared to simpler tools.
Potential for high costs due to pay-per-use models for premium AI models.
Users report frequent API errors and rate limits with certain models (e.g., Gemini).
Can "choke" on inline SVGs, leading to slower modifications and increased token usage.
May get stuck in truncated file loops, requiring manual intervention.
External MCP tools require careful security evaluation as they are independent applications.

Best For

Enterprise developers prioritizing control, transparency, and security with enterprise-licensed models.
Individual developers seeking direct access to frontier AI models without vendor lock-in.
Teams requiring context-aware suggestions, automated reviews, and multi-file changes.
Organizations with strict data privacy concerns due to client-side architecture.
Automation and CI/CD integration for recurring checks, updates, and custom workflows.

Tech & Notes

Open-source AI coding agent.
Client-side, zero-trust architecture; code and prompts never touch Cline's servers.
Connects directly to LLM providers: Amazon Bedrock, GCP Vertex, Azure OpenAI, local models.
Model-agnostic, supporting Anthropic (Claude), OpenAI (GPT-4), Google (Gemini), xAI, and others.
Uses Tree-sitter for codebase navigation and syntax tree building.
Integrates as an extension in Visual Studio Code and JetBrains IDEs (IntelliJ IDEA, PyCharm, WebStorm).
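The "Plan & Act" workflow with user review can be sketched as two phases: a planning step that only proposes work, and an acting step that runs each proposal through an approval callback. The function names, the stubbed planner, and the approval rule below are assumptions made for illustration, not Cline's implementation.

```python
from typing import Callable


def plan(task: str) -> list:
    # Stand-in for an LLM planning call: it proposes steps and changes nothing.
    return [
        f"read files related to: {task}",
        f"edit code for: {task}",
        "run tests",
    ]


def act(steps: list, approve: Callable[[str], bool],
        execute: Callable[[str], str]) -> list:
    """Execute a step only if the approval callback accepts it."""
    log = []
    for step in steps:
        if approve(step):
            log.append(execute(step))
        else:
            log.append(f"skipped: {step}")
    return log


steps = plan("fix login bug")
log = act(
    steps,
    approve=lambda step: not step.startswith("run"),  # reviewer declines commands
    execute=lambda step: f"done: {step}",
)
```

In an interactive tool the `approve` callback would be a prompt shown to the user; here it is a fixed rule so the example runs unattended.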

Install

code --install-extension saoudrizwan.claude-dev

GitHub ↗

Open-Source AI

Continue: AI Coding Assistant

Apache 2.0 · PR Checks (new primary) · VS Code · JetBrains · Local LLM Support · CI/CD Integration

⭐ 33k 📦 v1.2.22-vscode · 2026-03-27 🪪 Apache-2.0

As of 2026: Continue's primary positioning has pivoted to source-controlled AI checks, enforceable in CI — markdown-defined agents at .continue/checks/ that run on every pull request as GitHub status checks (green if clean, red with suggested diff if not). The original VS Code + JetBrains IDE extensions remain maintained as a secondary workflow. The notes below cover the IDE extension's features, which are accurate but no longer the headline product surface.

Continue is an open-source AI coding assistant that integrates directly into VS Code and JetBrains IDEs. It offers AI-powered code generation, completion, modification, and refactoring, providing developers with control and flexibility over their AI-assisted development workflows. The tool supports a wide array of AI models, including local LLMs, and emphasizes privacy and deep customization.

Strengths

Open-source and model-agnostic, supporting various LLMs including local models (Ollama, LM Studio)
Deep IDE integration with VS Code and JetBrains, offering chat, plan, agent, and autocomplete modes
Strong context awareness, gathering information from the entire workspace for relevant responses
Highly customizable through YAML configuration files for models, context, and commands
Supports AI checks on pull requests via CLI for automated code review and quality enforcement

Weaknesses

Inconsistent quality and sometimes mediocre core coding suggestions, even with state-of-the-art models
Users report stability and user experience issues, with the inline chat being described as 'awkward'
Steeper learning curve and complex initial setup/configuration, potentially overwhelming new users
Can experience performance lags with larger codebases or when running local models, requiring more powerful hardware
Occasional 'hallucinations' (inaccurate AI suggestions) require manual verification

Best For

Professional developers and teams prioritizing control, flexibility, and customization in AI coding tools
Enterprises and privacy-conscious developers needing local deployment and air-gapped operation
Teams focused on efficiency and consistent code quality through automated code reviews and agents
Developers seeking to integrate AI into existing IDE workflows without vendor lock-in
Users who want to leverage local LLMs for sensitive codebases or offline development

Tech & Notes

Functions as an IDE extension for VS Code and JetBrains products
Configurable via YAML files (e.g., `.continue/config.yaml`) for models, context, and commands
Supports OpenAI, Anthropic, Mistral AI, Google (Gemini), and local LLMs via Ollama, LM Studio, or llama.cpp
Released under the Apache 2.0 license
Includes a CLI for scripting and automating AI tasks, with an optional cloud service (Continue Hub) for team collaboration

Install

Security note: inspect remote install scripts before running them, and prefer official documentation or package managers where possible.

curl -fsSL https://raw.githubusercontent.com/continuedev/continue/main/extensions/cli/scripts/install.sh | bash

npm i -g @continuedev/cli

GitHub ↗

Open-source — VS Code Extension

Roo Code

Open Source · VS Code Extension · Multi-Agent · LLM Agnostic

⭐ 23k 📦 v3.53.0 · 2026-04-23 🪪 Apache-2.0

Roo Code is an open-source, AI-powered coding assistant that operates as a Visual Studio Code (VS Code) extension. It functions as an intelligent partner throughout the software development lifecycle, offering capabilities beyond basic autocompletion by orchestrating multiple AI agents directly within the developer's editor. Originating as a fork of the Cline VS Code extension, Roo Code focuses on accelerating development through advanced features and multi-agent workflows.

Strengths

Generates and modifies code from natural language descriptions, performing holistic changes across multiple files.
Assists in debugging, tracing failures, proposing fixes, and optimizing existing code.
Offers role-specific 'Modes' (Code, Architect, Ask, Debug, Custom) for granular control and task focus.
Integrates with various large language models (LLMs) such as OpenAI, Anthropic, Google Gemini, AWS Bedrock, and local models like Ollama.
Executes commands in VS Code's internal terminal, reads outputs, and can detect and fix errors in running applications.
Optimized for performance with large codebases, reducing token usage through diff-based edits and context management.
Provides a checkpoint system for automatic change tracking and 'Boomerang tasks' for coordinating complex workflows.

Weaknesses

Can consume a substantial number of tokens if planning modes are not used effectively, leading to higher costs.
Supports only one session per VS Code window, limiting parallel task management.
Performance and reliability can be inconsistent, varying with codebase complexity, prompt quality, and chosen AI model.
Lacks automated testing unless explicitly prompted, placing responsibility on the user.
Has documented instances of getting stuck in 'fail loops,' requiring a fresh start to tasks.
Lacks built-in features for monitoring usage, costs, access management, and security policies for larger organizations.

Best For

Individual developers seeking lightweight AI assistance within their IDE for scripting, prototyping, and task automation.
Teams prioritizing rapid iteration, modular workflows, and autonomous in-editor development.
Developers interested in exploring and extending the capabilities of AI agents within their development environment.
Organizations that require deep IDE integration and extensive customization options for their AI coding assistant.

Tech & Notes

Operates as a VS Code extension, running directly within the local development environment.
Architecture is centered around AI agent orchestration with specialized 'Modes' collaborating on tasks.
LLM agnostic, allowing integration with various commercial and local large language models.
Users maintain control by approving file changes and command executions.
Offers granular control over information fed into the AI model's context.
Integrates with Model Context Protocol (MCP) servers for extensibility.
Uses a shadow repository for internal checkpoints, while developers use standard Git for primary version control.
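The token savings claimed for diff-based edits can be illustrated with the standard library's `difflib`: for a one-line change in a 200-line file, the unified diff is far smaller than the rewritten file. The sketch uses character counts as a rough proxy for tokens, and the file contents are made up; it does not reproduce Roo Code's own edit format.

```python
import difflib

# A made-up 200-line file and a copy with a single line changed.
original = [f"line {i}\n" for i in range(1, 201)]
edited = list(original)
edited[99] = "line 100 (patched)\n"

# Unified diff with two lines of context on each side of the change.
diff = list(
    difflib.unified_diff(
        original, edited, fromfile="a/app.txt", tofile="b/app.txt", n=2
    )
)

diff_chars = sum(len(line) for line in diff)      # cost of sending the diff
full_chars = sum(len(line) for line in edited)    # cost of resending the file
```

The gap between `diff_chars` and `full_chars` grows with file size, which is why diff-style edits matter most on large codebases.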

Install

code --install-extension RooVeterinaryInc.roo-cline

GitHub ↗

Princeton NLP Group

SWE-agent: Autonomous Software Engineering Agent

Transforms LLMs into autonomous software engineers for bug fixing and issue resolution.

Open-Source · Autonomous Agent · Bug Fixing · Code Generation · SWE-bench · → mini-swe-agent for simple cases

⭐ 19k 📦 v1.1.0 · 2025-05-22 🪪 MIT

SWE-agent is an open-source framework that transforms large language models (LLMs) into autonomous software engineering agents. It enables these agents to independently navigate codebases, identify and fix bugs, and resolve issues within real GitHub repositories. The system operates within isolated Docker containers, utilizing a custom Agent-Computer Interface (ACI) for optimized LLM interaction.

Note (2026): the SWE-agent team has shifted focus to a simpler sibling, mini-swe-agent (~100 LOC, >74% on SWE-bench Verified), and recommends it for most CLI use cases. SWE-agent itself remains maintained for advanced configurations (multiple tool sets, custom history processors).

Strengths

Autonomously fixes bugs and resolves GitHub issues, including submitting pull requests.
Custom Agent-Computer Interface (ACI) provides simplified and efficient commands for LLMs.
Achieved state-of-the-art performance on benchmarks like SWE-bench and HumanEvalFix.
Isolated execution environment via Docker containers ensures secure and contained operations.
Supports multimodal input, allowing processing of images from GitHub issues.

Weaknesses

Performance significantly drops on more rigorous benchmarks like SWE-Bench Pro and real-world novel issues.
Struggles with complex dependencies, multi-file issues, and large or monorepo setups.
Requires Docker setup, API keys, and a Python environment, presenting a learning curve.
Not designed for production environments, lacking reliability and integration ecosystem.
Can incur costs for API usage with models like GPT-4-Turbo (approx. $2 per issue).

Best For

Automating bug detection and resolution in small to medium-sized repositories.
Researchers and academics experimenting with autonomous software engineering.
Benchmarking the performance of LLMs on software engineering tasks.
Teams focused on specific, well-scoped automation tasks.
Solving offensive cybersecurity tasks with its "EnIGMA" mode.

Tech & Notes

Built atop the Linux shell, allowing LLM access to common Linux commands.
Utilizes Docker for isolated execution environments for each task.
Features SWE-ReX, a runtime interface for interacting with sandboxed shell environments.
Configurable via a single YAML file, designed for hackability.
Requires Python and pip for installation, with `conda` or `venv` recommended for dependency management.
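One way to picture an Agent-Computer Interface with "simplified and efficient commands" is a file viewer that exposes a file in small numbered windows rather than dumping it whole. The `FileViewer` class, its command names, and the window size below are hypothetical; they are a sketch of the general idea, not SWE-agent's actual command set.

```python
class FileViewer:
    """Shows a file in fixed-size windows so a model never reads it all at once."""

    def __init__(self, text: str, window: int = 5):
        self.lines = text.splitlines()
        self.window = window
        self.top = 0  # zero-based index of the first visible line

    def _max_top(self) -> int:
        return max(len(self.lines) - self.window, 0)

    def view(self) -> str:
        end = min(self.top + self.window, len(self.lines))
        header = f"[lines {self.top + 1}-{end} of {len(self.lines)}]"
        body = [f"{i + 1}: {self.lines[i]}" for i in range(self.top, end)]
        return "\n".join([header, *body])

    def scroll_down(self) -> str:
        self.top = min(self.top + self.window, self._max_top())
        return self.view()

    def goto(self, line_no: int) -> str:
        # Clamp so the window always stays inside the file.
        self.top = min(max(line_no - 1, 0), self._max_top())
        return self.view()


viewer = FileViewer("\n".join(f"stmt_{n}" for n in range(1, 13)), window=5)
first = viewer.view()
second = viewer.scroll_down()
last = viewer.goto(12)
```

The header line tells the model where it is in the file, so each command's output is short but still self-locating.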

Install

git clone https://github.com/SWE-agent/SWE-agent.git && cd SWE-agent
conda env create -f environment.yml && conda activate swe-agent && pip install --editable . && ./setup.sh

GitHub ↗

Open-source LLM orchestration

LangChain

Python · JavaScript/TypeScript · LLM Orchestration · RAG · Agents · Active Community

⭐ 135k 📦 v1.3.2 · 2026-04-24 🪪 MIT

LangChain is an open-source orchestration framework designed to simplify the development of applications powered by large language models (LLMs). It provides a structured approach to building, deploying, and managing these applications, abstracting away much of the complexity involved in integrating LLMs with various data sources and tools. Its modular components support diverse applications like chatbots, conversational agents, and question-answering systems.

Strengths

Simplifies creation of complex LLM applications like RAG systems and chatbots.
Offers extensive integrations with various LLM providers, vector stores, and tools.
Modular design provides flexible building blocks for orchestrating multi-step workflows.
Accelerates development and improves productivity by abstracting LLM integration complexities.
Provides a consistent interface for interacting with different LLMs from various providers.
Supports stateful, multi-actor applications with LangGraph for advanced agent orchestration.

Weaknesses

Steep learning curve due to modular design and layers of abstraction.
Frequent updates and unstable APIs often introduce breaking changes.
Documentation can be outdated, inconsistent, or confusing.
Abstractions can lead to dependency bloat and make debugging challenging.
Performance bottlenecks and scalability issues can arise with high request volumes or large datasets.
Security vulnerabilities, including path traversal and SQL injection flaws, have been reported.

Best For

Building sophisticated agents and Retrieval Augmented Generation (RAG) systems.
Integrating LLMs into existing applications and workflows.
Prototyping LLM pipelines and experimenting with different models and prompts.
Creating dynamic, data-responsive, and context-aware applications powered by LLMs.
Developing internal agents and customer-facing copilots.

Tech & Notes

Available as open-source libraries in Python and JavaScript/TypeScript.
Features a modular, layered architecture with `langchain-core` for abstractions and `langchain` for implementations.
Utilizes LangChain Expression Language (LCEL) for composing `Runnables` with built-in async, batch, and streaming support.
Ecosystem includes LangGraph for stateful agent applications, LangSmith for debugging/monitoring, and LangServe for REST API deployment.
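The composition style behind LCEL, where steps are chained with `|` and the result supports single and batched calls, can be imitated in a few lines of plain Python. The `Step` class below is a standard-library stand-in written for this catalog; it is not LangChain's `Runnable` and omits async and streaming support.

```python
from typing import Any, Callable


class Step:
    """Wraps a function so steps compose with `|` (a toy take on the pipe idea)."""

    def __init__(self, fn: Callable[[Any], Any]):
        self.fn = fn

    def __or__(self, other: "Step") -> "Step":
        # The output of this step becomes the input of the next one.
        return Step(lambda value: other.fn(self.fn(value)))

    def invoke(self, value: Any) -> Any:
        return self.fn(value)

    def batch(self, values: list) -> list:
        return [self.invoke(v) for v in values]


build_prompt = Step(lambda topic: f"Write one line about {topic}")
fake_llm = Step(lambda prompt: prompt.upper())  # stand-in for a model call
parse = Step(lambda text: text.split())

chain = build_prompt | fake_llm | parse
tokens = chain.invoke("agents")
batched = chain.batch(["memory", "tools"])
```

Because the chain is itself a `Step`, it can be embedded in a larger chain, which is the property that makes pipe-style composition convenient.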

Install

pip install -U langchain

GitHub ↗

LangChain

LangGraph: Stateful AI Agent Workflows

Graph-based · Stateful Agents · Multi-Agent Orchestration · Open Source · Python

⭐ 30k 📦 v1.0.11 · 2026-04-24 🪪 MIT

LangGraph is an open-source AI agent framework built on LangChain, designed for building, deploying, and managing complex generative AI agent workflows. It provides a low-level orchestration framework and runtime for creating long-running, stateful agents by leveraging graph-based architectures, enabling dynamic execution paths, loops, and explicit state management.

Strengths

Manages complex, non-linear, and dynamic workflows with loops and conditional branching.
Provides explicit control over workflow state, crucial for long-running processes and retries.
Excels at coordinating multiple AI agents with specialized roles within a single workflow.
Supports Human-in-the-Loop (HITL) integration for human review or modification during execution.
Offers durable execution, allowing agents to persist through failures and resume from where they left off.
Integrates with LangSmith for real-time logging, visual debugging, and monitoring of agent workflows.
Supports streaming output and real-time feedback on execution status.
Considered production-ready and used by companies like Klarna, Uber, and J.P. Morgan.

Weaknesses

Requires more upfront design and boilerplate code compared to simpler frameworks.
Agents can fall into counterproductive loops if the graph is not carefully designed, increasing token consumption.
Debugging complex chains can be difficult without LangGraph Studio due to abstraction layers.
Framework orchestration overhead can increase latency and memory consumption at scale.
Users have reported frequent breaking changes with updates, leading to maintenance challenges.
Recent cybersecurity research disclosed vulnerabilities that could expose filesystem data and environment secrets.
Documentation can be incomplete or lag behind the actual code, with outdated examples.

Best For

Teams building multi-turn, adaptive chatbots and conversational AI systems.
Developers creating systems where multiple AI agents collaborate on complex tasks (e.g., robotics, content generation).
Organizations automating intricate business processes like identity verification or financial analysis.
Companies prioritizing reliability, observability, and fine-grained control in production-grade AI systems.
Teams building sophisticated LLM applications that need to learn and improve over time with reflection and enhanced decision-making.

Tech & Notes

Core architecture is based on directed graphs with nodes (logic) and edges (control flow).
Utilizes a stateful graph concept where a shared, central state flows through the workflow.
Low-level orchestration framework and runtime for stateful, long-running workflows.
Built on the LangChain ecosystem but can be used standalone.
Inspired by systems like Pregel and Apache Beam, with a public interface drawing from NetworkX.
Supports both local execution and deployment on managed platforms like LangGraph Platform and Studio.
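The nodes, edges, and shared state described above can be sketched without any dependencies: nodes are functions from state to state, edges pick the next node, and a conditional edge makes loops possible. `TinyGraph`, the `"END"` sentinel, and the step limit are assumptions for this example and do not mirror LangGraph's API.

```python
from typing import Callable

State = dict


class TinyGraph:
    """Toy state graph: nodes transform state, routers choose the next node."""

    def __init__(self):
        self.nodes = {}    # node name -> function(State) -> State
        self.routers = {}  # node name -> function(State) -> next node name

    def add_node(self, name: str, fn: Callable[[State], State]) -> None:
        self.nodes[name] = fn

    def add_edge(self, src: str, dst: str) -> None:
        self.routers[src] = lambda _state: dst

    def add_conditional_edge(self, src: str, router: Callable[[State], str]) -> None:
        self.routers[src] = router

    def run(self, entry: str, state: State, max_steps: int = 20) -> State:
        current, steps = entry, 0
        while current != "END":
            if steps >= max_steps:
                raise RuntimeError("step limit reached; possible runaway loop")
            state = self.nodes[current](state)
            current = self.routers[current](state)
            steps += 1
        return state


graph = TinyGraph()
graph.add_node("draft", lambda s: {**s, "attempts": s["attempts"] + 1})
graph.add_node("review", lambda s: {**s, "approved": s["attempts"] >= 3})
graph.add_edge("draft", "review")
graph.add_conditional_edge("review", lambda s: "END" if s["approved"] else "draft")
final = graph.run("draft", {"attempts": 0, "approved": False})
```

The `max_steps` guard is the sketch's answer to the looping risk listed under Weaknesses: a loop that never satisfies its exit condition fails loudly instead of consuming tokens indefinitely.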

Install

pip install -U langgraph

GitHub ↗

Microsoft — Multi-Agent AI

AutoGen

⚠ Maintenance Mode · Multi-Agent · Open-source · Python · .NET · Code Execution · Human-in-the-Loop

⭐ 57k 📦 v0.7.5 · 2025-09-30 🪪 CC-BY-4.0

⚠ Maintenance Mode (per upstream README): AutoGen will not receive new features or enhancements; Microsoft directs new users to Microsoft Agent Framework, with a migration guide for existing users. The notes below describe AutoGen's design as it stands and remain useful context for migration planning.

AutoGen is an open-source framework from Microsoft for building multi-agent AI applications. It enables the creation of systems where multiple AI agents converse with each other to accomplish complex tasks, leveraging large language models (LLMs), tools, and human input. The framework streamlines the orchestration, automation, and optimization of complex LLM workflows by treating tasks as dialogues among multiple agents.

Strengths

Coordinates multiple specialized AI agents for complex tasks
High degree of customization for agent roles, capabilities, and tools
Automated code execution and debugging within multi-agent workflows
Supports human oversight and intervention at critical decision-making points
Asynchronous, event-driven architecture for scalability and robustness

Weaknesses

Steep learning curve for new users
Multi-agent workflows can become unreliable for highly complex tasks (e.g., >3 hops)
High costs and API rate limits due to iterative agent conversations
Challenges with integrating and achieving good performance from open-source LLMs
Less predictable outcomes compared to frameworks with explicit workflow control

Best For

Teams building applications requiring natural conversational flows between autonomous agents
Projects needing both fully autonomous operation and optional human oversight
Use cases involving autonomous code generation, execution, and debugging
Organizations looking to automate business processes by composing teams of specialized agents
Developers and researchers prototyping new ideas and building sophisticated multi-agent applications

Tech & Notes

Supports Python 3.10 or later
Distributed runtime supports agents in different programming languages (Python and .NET currently)
Asynchronous message passing and event-driven runtime
Includes utilities for enhanced LLM inference, API unification, and caching
OpenTelemetry support for observability and debugging
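Treating a task as a dialogue between agents over asynchronous message passing can be shown with two coroutines exchanging messages through `asyncio.Queue`. The agent names, reply functions, and fixed turn count are made up for this sketch; it illustrates the pattern only and does not use AutoGen's classes.

```python
import asyncio


async def agent(name, inbox, outbox, reply, transcript, max_turns):
    """Receive a message, record it, and send a reply, for a fixed number of turns."""
    for _ in range(max_turns):
        message = await inbox.get()
        transcript.append((name, message))
        await outbox.put(reply(message))


async def main():
    to_critic = asyncio.Queue()
    to_writer = asyncio.Queue()
    transcript = []
    await to_writer.put("task: describe agents")  # seed the conversation
    writer = agent("writer", to_writer, to_critic,
                   lambda m: f"draft for [{m}]", transcript, 2)
    critic = agent("critic", to_critic, to_writer,
                   lambda m: f"feedback on [{m}]", transcript, 2)
    await asyncio.gather(writer, critic)
    return transcript


transcript = asyncio.run(main())
```

Each extra turn in such a loop is another model call in a real system, which is the cost pattern noted under Weaknesses.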

Install

pip install -U "autogen-agentchat" "autogen-ext[openai]"

GitHub ↗

Open-source — Multi-Agent Orchestration

CrewAI: Collaborative AI Agent Framework

Python · Multi-Agent · LLM Agnostic · Open Source · Role-Based

⭐ 50k 📦 v1.14.3 · 2026-04-24 🪪 MIT

CrewAI is an open-source Python framework for orchestrating collaborative teams of autonomous AI agents to tackle complex tasks. It enables the creation of "crews" of AI agents, each with specific roles, goals, and tools, allowing them to work together in a manner similar to human teams. The framework supports sophisticated multi-agent workflows, task and process orchestration, and integration with various tools and LLMs.

Strengths

Facilitates sophisticated multi-agent workflows with role-based agent design
Supports dynamic tool integration, including web searches, file operations, and custom functions
Allows agents to dynamically build and execute their own tools through code execution
Provides real-time tracing for observability and supports automated/human-in-the-loop agent training
Offers a dual-layer architecture with autonomous 'Crews' and structured 'Flows' for orchestration
Designed with a developer-centric approach, offering elegant APIs and customization
Includes a dedicated CLI with `crewai test` and `crewai train` commands for automated scoring and optimization
Supports structured outputs through schemas

Weaknesses

Developing complex agentic flows can be intricate and require significant trial and error
Slow execution speed for complex multi-agent workflows, leading to high computational costs
Multi-agent conversations can generate more tokens than necessary for simple tasks, incurring costs
Challenges in implementing robust planning for complex tasks, with agents getting stuck in hierarchical loops
Less mature ecosystem and community compared to established frameworks like LangChain
Advanced features like memory and knowledge management are tied to local datastores, limiting enterprise scalability
Collects usage patterns and information without an explicit opt-out option, raising privacy concerns
Not well-optimized for open-source models, with difficulties in supporting function and tool-calling features
Lacks a graphical user interface, necessitating coding expertise for workflow creation
Ongoing high-severity bugs reported on GitHub, including tool errors and asynchronous deadlocks

Best For

Projects requiring complex, multi-step problem-solving through collaboration
AI engineers and developers building and customizing agents and workflows
Teams for back-office automation (e.g., research, data analysis, summarization)
Content creation and marketing (e.g., automated research, content pipelines)
Business intelligence tasks requiring multiple specialized skills

Tech & Notes

Python-based framework
LLM Agnostic, supporting models from OpenAI, Anthropic, Mistral, and Amazon Bedrock
Architecture built around Agents, Tasks, Tools, Crews, and Flows
Uses `uv` for dependency management and package handling
Enterprise features (CrewAI AMP/Plus) offer centralized management, monitoring, security, and autoscaling
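The Agents, Tasks, and Crews vocabulary can be sketched as three small dataclasses in which each task's output becomes context for the next. These toy classes borrow the terms from the notes above but are written for this example; the lambda "workers" stand in for LLM-backed agents, and nothing here is CrewAI's real implementation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Agent:
    role: str
    goal: str
    work: Callable[[str, str], str]  # (task description, context) -> output


@dataclass
class Task:
    description: str
    agent: Agent


@dataclass
class Crew:
    tasks: list

    def kickoff(self) -> str:
        # Sequential process: each task sees the previous task's output.
        context = ""
        for task in self.tasks:
            context = task.agent.work(task.description, context)
        return context


researcher = Agent("researcher", "collect facts",
                   lambda desc, ctx: f"notes on {desc}")
writer = Agent("writer", "summarize findings",
               lambda desc, ctx: f"{desc}: based on [{ctx}]")

crew = Crew(tasks=[Task("agent frameworks", researcher), Task("summary", writer)])
output = crew.kickoff()
```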

Install

uv tool install crewai
crewai create crew <project_name>

GitHub ↗

Coding Agents (18): write, edit, refactor, run code

Aider

terminal pair-programmer · Aider-AI/aider

Terminal AI pair programmer that edits code across multiple files in your local git repo. Auto-commits each AI change with a descriptive message.

Apache-2.0

Python

Git-native

GitHub ↗

Continue

IDE extension · continuedev/continue

Open-source VS Code/JetBrains extension plus CLI ("cn") for AI chat, autocomplete, and source-controlled CI checks. Hub-based config; markdown-defined CI checks enforceable as GitHub statuses.

Apache-2.0

TypeScript

GitHub ↗

Cline

VS Code agent · cline/cline

Autonomous coding agent inside VS Code that creates/edits files, runs commands, and drives a browser — with per-step approval gating. Renamed from Claude Dev.

Apache-2.0

TypeScript

Step approval

GitHub ↗
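
The per-step approval gating described in the Cline card can be pictured as a loop that asks before every action. What follows is a minimal Python sketch of that idea, not Cline's implementation (Cline itself is a TypeScript extension). The `Action` class and the `run_with_approval`, `approve`, and `execute` names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Action:
    """One step the agent proposes (hypothetical shape, not Cline's schema)."""
    kind: str    # e.g. "write_file" or "run_command"
    detail: str  # human-readable description shown to the approver


def run_with_approval(
    proposed: Iterable[Action],
    approve: Callable[[Action], bool],
    execute: Callable[[Action], str],
) -> List[str]:
    """Execute actions one at a time, stopping at the first rejection."""
    results: List[str] = []
    for action in proposed:
        if not approve(action):
            results.append(f"rejected: {action.kind}")
            break  # a rejected step halts the run; nothing after it executes
        results.append(execute(action))
    return results


def demo() -> List[str]:
    plan = [
        Action("write_file", "create app.py"),
        Action("run_command", "rm -rf build/"),
        Action("write_file", "update README.md"),
    ]
    # Stand-in for a human clicking Approve or Reject:
    # allow file edits, refuse shell commands.
    approve = lambda a: a.kind != "run_command"
    execute = lambda a: f"done: {a.detail}"
    return run_with_approval(plan, approve, execute)


if __name__ == "__main__":
    print(demo())
```

In this sketch a rejection stops the whole run rather than skipping one step, so later actions that may depend on the rejected one never execute. A real tool could instead hand the rejection back to the model and let it re-plan.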

Roo Code

VS Code multi-mode · RooCodeInc/Roo-Code

Cline fork with multi-mode personas (Architect / Code / Debug). Custom-mode framework lets you define your own agent personalities with scoped permissions.

Apache-2.0

TypeScript

GitHub ↗

OpenHands

Devin-style platform · All-Hands-AI/OpenHands

Largest OSS Devin-style agent (~71k stars). Agents write code, run shells, browse the web in a sandbox. SDK + CLI + GUI + cloud. Formerly OpenDevin.

MIT

Python

~71k stars

GitHub ↗

SWE-agent

issue-to-patch · SWE-agent/SWE-agent

Takes a GitHub issue and tries to autonomously patch the repo. Built around a research-grade Agent-Computer Interface; reference implementation for SWE-bench.

MIT

Python

Princeton NLP

GitHub ↗

Bolt.new

in-browser builder · stackblitz/bolt.new

Prompt-to-app full-stack builder running entirely in the browser via WebContainers. No local install — Node + npm in the browser.

MIT

TypeScript

Browser-only

GitHub ↗

GPT Engineer

spec-to-repo · AntonOsika/gpt-engineer

CLI that takes a natural-language spec and scaffolds + iterates a complete codebase. One of the original spec-to-repo agents — pioneered the genre.

MIT

Python

GitHub ↗

smol-developer

embeddable junior dev · smol-ai/developer

Library/agent that scaffolds an entire codebase from a product spec, or runs as an embedded "junior dev" inside your app. Designed to be embedded, not just CLI.

MIT

Python

GitHub ↗

Plandex

terminal sandbox · plandex-ai/plandex

Terminal AI for large multi-file/multi-step tasks. Cumulative diff-review sandbox — pending changes stay isolated until you apply them.

MIT

Diff-sandbox

GitHub ↗
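
The "pending changes stay isolated until you apply them" idea in the Plandex card can be sketched with an in-memory overlay and the standard-library `difflib`. This is a toy model under stated assumptions, not Plandex's design: the `PendingChanges` class and its method names are hypothetical, and a plain dict stands in for the working tree.

```python
import difflib
from typing import Dict, List


class PendingChanges:
    """Accumulates proposed file contents without touching the real files."""

    def __init__(self, files: Dict[str, str]) -> None:
        self.files = files                 # stand-in for the working tree
        self.pending: Dict[str, str] = {}  # path -> proposed full content

    def propose(self, path: str, new_content: str) -> None:
        # A later proposal for the same path replaces the earlier one.
        self.pending[path] = new_content

    def current(self, path: str) -> str:
        """What the agent sees: pending content if any, else the real file."""
        return self.pending.get(path, self.files.get(path, ""))

    def diff(self) -> str:
        """One cumulative unified diff of everything still pending."""
        chunks: List[str] = []
        for path in sorted(self.pending):
            old = self.files.get(path, "").splitlines(keepends=True)
            new = self.pending[path].splitlines(keepends=True)
            chunks.extend(difflib.unified_diff(old, new, f"a/{path}", f"b/{path}"))
        return "".join(chunks)

    def apply(self) -> None:
        """Promote all pending changes into the working tree."""
        self.files.update(self.pending)
        self.pending.clear()

    def reject(self) -> None:
        """Discard all pending changes; the working tree is untouched."""
        self.pending.clear()


if __name__ == "__main__":
    tree = {"app.py": "print('hi')\n"}
    sandbox = PendingChanges(tree)
    sandbox.propose("app.py", "print('hello')\n")
    sandbox.propose("app.py", sandbox.current("app.py") + "print('bye')\n")
    print(sandbox.diff())  # review first; tree is still unchanged here
    sandbox.apply()
```

Because each step builds on `current()` rather than on the real file, several edits fold into a single diff that can be reviewed once before `apply()`.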

PearAI

VS Code fork · trypear/pearai-app

Open-source VS Code fork bundling AI chat, inline edits, and codebase awareness out of the box. Cursor-style batteries-included experience as OSS.

Apache-2.0

TypeScript

GitHub ↗

Goose

native agent · block/goose

Native open-source AI agent (desktop app + CLI + API) for coding and arbitrary workflows. 70+ MCP extensions, 15+ LLM providers, runs locally. From Block (Square / Cash App).

Apache-2.0

Rust

MCP-first

GitHub ↗

Crush

TUI coding agent · charmbracelet/crush

"Glamourous" terminal-based agentic coding tool with multi-model and per-project config. Charm-grade TUI polish; mid-session model switching.

FSL-1.1-MIT

TUI

GitHub ↗

Void

OSS Cursor alternative · voideditor/void

Open-source VS Code fork positioned as the OSS Cursor alternative. Same VS Code base; agent + chat + local-model support; bring-your-own model.

Apache-2.0

TypeScript

GitHub ↗

Cursor

closed VS Code fork · Anysphere

The reference proprietary AI editor. VS Code fork with deep AI chat, agent mode, and Composer. Repo is issues-only — closed binary.

Closed

Subscription

Vendor ↗

Devin

cloud SWE · Cognition Labs

Cloud "AI software engineer" running long-horizon tasks autonomously in its own VM. Proprietary SWE-1.5 model family. GA Feb 2026. Cognition also acquired Windsurf.

Cloud

Proprietary

Vendor ↗

v0 (Vercel)

UI generator · vercel/v0-sdk

SDK for the v0 Platform API to drive AI-generated UI/full-stack code. Only the SDK is OSS; the v0.app generator itself is proprietary.

SDK: Apache-2.0

App: Closed

SDK ↗

Mistral Code

enterprise copilot · Mistral AI

Enterprise VS Code/JetBrains coding copilot bundling Codestral, Devstral, and chat models. Fork of OSS Continue with Mistral's in-house code models. Local-deploy options.

Closed

TypeScript

Vendor ↗

Multi-Agent Orchestration 21 — libraries to build systems where agents collaborate

LangChain

general-purpose · langchain-ai/langchain

The default toolbox. Massive integration surface — chains LLMs, tools, retrieval into apps. Historically criticized for over-abstraction; v0.3+ slimmed down significantly.

~134k stars

MIT

Python

GitHub ↗

LangGraph

graph-based agents · langchain-ai/langgraph

Low-level library for modeling agents as stateful graphs with checkpointing and human-in-the-loop. Explicit graph control over agent flow; durable execution for long runs.

~30k stars

MIT

Python

GitHub ↗
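
To make "stateful graphs with checkpointing and human-in-the-loop" concrete, here is a small standard-library sketch. It deliberately does not use the LangGraph API: `TinyGraph`, `add_node`, `run`, and `resume` are hypothetical names, and the graph is limited to one outgoing edge per node.

```python
import copy
from typing import Callable, Dict, List, Optional, Tuple

State = Dict[str, object]
Node = Callable[[State], State]


class TinyGraph:
    """Toy state graph: named nodes, one outgoing edge each, a checkpoint per step."""

    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}
        self.edges: Dict[str, Optional[str]] = {}
        self.checkpoints: List[Tuple[str, State]] = []  # (next node, state snapshot)

    def add_node(self, name: str, fn: Node, then: Optional[str] = None) -> None:
        self.nodes[name] = fn
        self.edges[name] = then  # None marks the end of the graph

    def run(self, start: str, state: State, pause_before: Optional[str] = None) -> State:
        current: Optional[str] = start
        while current is not None:
            # Snapshot before running the node so the run can resume from here.
            self.checkpoints.append((current, copy.deepcopy(state)))
            if current == pause_before:
                return state  # human-in-the-loop: stop and wait for review
            state = self.nodes[current](state)
            current = self.edges[current]
        return state

    def resume(self) -> State:
        """Continue from the most recent checkpoint."""
        node, state = self.checkpoints.pop()
        return self.run(node, state)


def build() -> TinyGraph:
    graph = TinyGraph()
    graph.add_node("draft", lambda s: {**s, "text": "draft v1"}, then="publish")
    graph.add_node("publish", lambda s: {**s, "published": True})
    return graph


if __name__ == "__main__":
    graph = build()
    print(graph.run("draft", {}, pause_before="publish"))  # paused for review
    print(graph.resume())                                  # continues to the end
```

A durable version would write each checkpoint to storage instead of a list, so a long run survives a process restart.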

AutoGen

conversational multi-agent · microsoft/autogen

Microsoft Research framework where agents talk to each other and to humans. v0.4 rewrite split into core/agentchat for async/event-driven runtime.

~57k stars

CC-BY-4.0 + MIT

Python

GitHub ↗

CrewAI

role-based crews · crewAIInc/crewAI

Role-based crews collaborating via assigned roles, goals, and processes. Built independent of LangChain; opinionated abstraction popular for business workflows.

~49k stars

MIT

Python

GitHub ↗

smolagents

code-action agents · huggingface/smolagents

Minimal HuggingFace library where agents write Python code to act, instead of emitting JSON tool calls. Tiny core (~1k LOC), Hub-native tool sharing.

~27k stars

Apache-2.0

Code-as-action

GitHub ↗
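
The difference between JSON tool calls and code actions is easier to see side by side. The sketch below is a standard-library illustration of the two styles, not the smolagents API; the `add` and `multiply` tools and both `run_*` helpers are hypothetical.

```python
import json
from typing import Callable, Dict


# Hypothetical tools, for illustration only.
def add(a: float, b: float) -> float:
    return a + b


def multiply(a: float, b: float) -> float:
    return a * b


TOOLS: Dict[str, Callable[..., float]] = {"add": add, "multiply": multiply}


def run_json_action(message: str) -> float:
    """JSON style: the model emits one tool call per turn."""
    call = json.loads(message)
    return TOOLS[call["tool"]](**call["args"])


def run_code_action(snippet: str) -> object:
    """Code style: the model writes Python that may chain several tools.

    WARNING: exec on model output is unsafe outside a real sandbox. Emptying
    the builtins here is a convenience for the demo, not a security boundary.
    """
    namespace = {"__builtins__": {}, **TOOLS}
    exec(snippet, namespace)
    return namespace.get("result")


if __name__ == "__main__":
    # Computing (2 + 3) * 4 takes two JSON round trips...
    partial = run_json_action('{"tool": "add", "args": {"a": 2, "b": 3}}')
    total = run_json_action(
        json.dumps({"tool": "multiply", "args": {"a": partial, "b": 4}})
    )
    # ...but a single code action.
    same_total = run_code_action("result = multiply(add(2, 3), 4)")
    print(total, same_total)
```

The code style lets one model turn compose several tool calls, which is the trade the card describes; the cost is that executing model-written code needs proper isolation.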

Pydantic AI

type-safe agents · pydantic/pydantic-ai

Type-safe framework using Pydantic for structured I/O and validation. FastAPI-style ergonomics, dependency injection, model-agnostic, strong typing throughout.

~17k stars

MIT

Type-safe

GitHub ↗

LlamaIndex

RAG-first agents · run-llama/llama_index

Data framework for RAG that has expanded into agentic workflows over indexed documents. Strongest story for document/OCR-heavy agents; "Workflows" event-driven orchestration.

~49k stars

MIT

Python

GitHub ↗

Haystack

production pipelines · deepset-ai/haystack

Pipeline-oriented framework for production LLM/RAG apps with agent components. DAG-style pipelines preferred by enterprise NLP teams; mature evaluation tooling.

~25k stars

Apache-2.0

Python

GitHub ↗

Agno

formerly Phidata · agno-agi/agno

High-performance multi-agent runtime with memory, knowledge, and an "AgentOS" control plane. Renamed from Phidata Jan 2025; pitches raw agent instantiation speed and ops layer.

~40k stars

Apache-2.0

GitHub ↗

Letta

stateful memory agents · letta-ai/letta

Server for stateful agents with persistent long-term memory and self-editing context. Memory IS the product — agents are addressable services with durable state. Formerly MemGPT.

~22k stars

Apache-2.0

Memory-first

GitHub ↗
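
"Self-editing context" can be read as: part of the prompt is a set of labeled memory blocks, and the agent has tools that rewrite those blocks. The sketch below is a toy model of that reading, not Letta's API or storage layer. `MemoryBlock`, `CoreMemory`, the character limit, and the rendered tag format are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class MemoryBlock:
    label: str
    value: str = ""
    limit: int = 200  # character budget, since the block is pasted into the prompt


@dataclass
class CoreMemory:
    blocks: Dict[str, MemoryBlock] = field(default_factory=dict)

    def _store(self, block: MemoryBlock, new_value: str) -> None:
        if len(new_value) > block.limit:
            raise ValueError(f"block '{block.label}' would exceed {block.limit} chars")
        block.value = new_value

    def append(self, label: str, text: str) -> None:
        """Tool the agent calls to add a fact to a block."""
        block = self.blocks.setdefault(label, MemoryBlock(label))
        self._store(block, f"{block.value}\n{text}".strip())

    def replace(self, label: str, old: str, new: str) -> None:
        """Tool the agent calls to correct a fact it stored earlier."""
        block = self.blocks[label]
        if old not in block.value:
            raise ValueError(f"'{old}' not found in block '{label}'")
        self._store(block, block.value.replace(old, new))

    def render(self) -> str:
        """Text injected into every prompt, so edits persist across turns."""
        return "\n".join(
            f"<{b.label}>\n{b.value}\n</{b.label}>" for b in self.blocks.values()
        )


if __name__ == "__main__":
    memory = CoreMemory()
    memory.append("human", "Name: Sam")
    memory.append("human", "Prefers Rust")
    memory.replace("human", "Rust", "Go")  # the agent corrects itself
    print(memory.render())
```

The size limit is what forces the editing behavior: when a block is full, the agent has to rewrite or summarize rather than append indefinitely.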

Agency Swarm

OpenAI SDK extension · VRSEN/agency-swarm

Multi-agent orchestrator built on top of the OpenAI Agents SDK. Thin, opinionated layer with a CEO/worker agency metaphor.

~4k stars

MIT

GitHub ↗

MetaGPT

software-co simulator · FoundationAgents/MetaGPT

Simulates a software company: PM, architect, engineer, QA agents collaborate to ship code. SOP-driven role simulation; one of the earliest "agents-as-software-team" projects.

~67k stars

MIT

Python

GitHub ↗

ChatDev

virtual SDLC · OpenBMB/ChatDev

Virtual software company of LLM agents that design, code, test, document apps from a prompt. Research project pairing waterfall SDLC stages with chat-chain communication.

~33k stars

Apache-2.0

Tsinghua

GitHub ↗

CAMEL

role-play research · camel-ai/camel

Role-playing communicative agent framework focused on studying agent scaling laws. Academic origin; emphasis on synthetic dataset generation and large-scale agent simulations.

~17k stars

Apache-2.0

GitHub ↗

Strands Agents

AWS-backed SDK · strands-agents/sdk-python

Model-driven agent SDK from AWS, used internally in Bedrock-powered services. v1.0 (2025) added A2A protocol support and multi-agent primitives. 14M downloads by early 2026.

~6k stars

Apache-2.0

AWS-backed

GitHub ↗

DSPy

prompt compiler · stanfordnlp/dspy

"Programming, not prompting." Compiler-style framework that optimizes prompts and weights via modules + optimizers (MIPRO, BootstrapFewShot) instead of hand-writing them.

~34k stars

MIT

Stanford NLP

GitHub ↗

Marvin

structured outputs · PrefectHQ/marvin

Library for structured outputs and lightweight agentic workflows. From the Prefect team — treats LLM calls as typed Python functions; integrates with task orchestration.

~6k stars

Apache-2.0

GitHub ↗

Atomic Agents

small primitives · BrainBlend-AI/atomic-agents

Modular framework on Instructor + Pydantic with small composable units. Anti-framework framework — explicit, small primitives, no hidden prompt magic.

~6k stars

MIT

GitHub ↗

Parlant

conversation control · emcie-co/parlant

Conversation harness for customer-facing agents using "guidelines" instead of system prompts. Behavioral guardrails as first-class objects; targets regulated/customer-support deployments.

~18k stars

Apache-2.0

Guardrails-first

GitHub ↗

OpenAI Agents SDK

official primitives · openai/openai-agents-python

Production successor to Swarm. Lightweight multi-agent framework — handoffs, guardrails, tracing. Now the base layer many other libs extend.

~22k stars

MIT

Official

GitHub ↗

Swarms

enterprise topologies · kyegomez/swarms

Enterprise-leaning multi-agent orchestration framework with hierarchical and parallel swarm topologies. Distinct from OpenAI's archived Swarm.

~6k stars

Apache-2.0

GitHub ↗

Browser & Computer-Use 10 — agents that drive a browser, desktop, or GUI

browser-use

Playwright + vision · browser-use/browser-use

Library that makes websites accessible to AI agents by extracting interactive elements and driving Playwright. Vision + DOM hybrid extraction. Fastest-growing browser-agent repo of 2025–2026.

~50k+ stars

MIT

Python

GitHub ↗
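
The DOM half of "extracting interactive elements" can be approximated with the standard-library HTML parser: walk the markup, keep only elements a user could act on, and number them so the model can answer with an index. This is a simplified sketch, not browser-use's extraction logic. The tag list, the label heuristic, and the `[index] <tag> label` output format are assumptions.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional, Tuple

INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}


class InteractiveElementIndexer(HTMLParser):
    """Collects interactive elements in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.elements: List[Dict[str, str]] = []
        self._open: Optional[Dict[str, str]] = None  # element still collecting text

    def handle_starttag(self, tag: str, attrs: List[Tuple[str, Optional[str]]]) -> None:
        if tag not in INTERACTIVE_TAGS:
            return
        attr = {key: value or "" for key, value in attrs}
        label = attr.get("aria-label") or attr.get("placeholder") or attr.get("name", "")
        element = {"tag": tag, "label": label}
        self.elements.append(element)
        # <input> is a void element, so it never receives text content.
        self._open = None if tag == "input" else element

    def handle_data(self, data: str) -> None:
        # Use the first visible text as the label when no attribute supplied one.
        if self._open is not None and data.strip() and not self._open["label"]:
            self._open["label"] = data.strip()

    def handle_endtag(self, tag: str) -> None:
        if self._open is not None and tag == self._open["tag"]:
            self._open = None


def describe_page(html: str) -> List[str]:
    """Render elements as '[index] <tag> label' lines for a model prompt."""
    parser = InteractiveElementIndexer()
    parser.feed(html)
    parser.close()
    return [f"[{i}] <{e['tag']}> {e['label']}" for i, e in enumerate(parser.elements)]


if __name__ == "__main__":
    page = (
        '<form><input name="q" placeholder="Search"><button>Go</button></form>'
        '<p>Plain text is ignored.</p><a href="/about">About us</a>'
    )
    print("\n".join(describe_page(page)))
```

A model that replies "click 1" can then be mapped back to the second collected element and executed through a browser driver. The vision half of the hybrid approach is outside the scope of this sketch.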

Skyvern

workflow automator · Skyvern-AI/skyvern

Automates browser workflows using LLMs and computer vision. Self-healing workflows with vision-first approach. Handles auth/2FA/CAPTCHAs. YC-backed.

AGPL-3.0

Python

Self-healing

GitHub ↗

LaVague

large action model · lavague-ai/LaVague

Large Action Model framework that turns natural-language instructions into Selenium/Playwright code. "World Model + Action Engine" split.

Apache-2.0

Python

GitHub ↗

Stagehand

act/extract/observe · browserbase/stagehand

AI browser automation SDK that augments Playwright with act/extract/observe primitives. TypeScript-native, deterministic Playwright fallback alongside AI actions.

MIT

TypeScript

Browserbase

GitHub ↗

Open Interpreter

code-as-tool · OpenInterpreter/open-interpreter

Lets LLMs run code locally to control your computer through a ChatGPT-like terminal. Local code-execution-as-tool model; runs Python/JS/Shell on host.

~50k+ stars

AGPL-3.0

Python

GitHub ↗

Self-Operating Computer

vision loop · OthersideAI/self-operating-computer

Multimodal models view a screen and operate the mouse/keyboard to reach a goal. Pure vision loop (screenshot → click coords). Model-agnostic (GPT-4o, Gemini, Claude, LLaVA).

MIT

Python

GitHub ↗

Agent S / S2

generalist GUI agent · simular-ai/Agent-S

Open agentic framework that uses computers like a human via compositional generalist-specialist design. Experience-augmented hierarchical planning. Topped OSWorld benchmark.

Apache-2.0

Python

OSWorld SOTA

GitHub ↗

Bytebot

containerized desktop · bytebot-ai/bytebot

Self-hosted AI desktop agent that runs in a containerized Linux desktop you can chat with. Full virtual desktop in Docker — agent has its own OS, not yours.

Apache-2.0

TypeScript

Sandboxed

GitHub ↗

Pig.dev

Windows VMs · pig-dot-dev/piglet

Computer-use API and SDK for controlling Windows VMs with AI agents. Windows-focused — most rivals target Linux/browser. YC-backed.

Apache-2.0

Go / Python

Windows-first

GitHub ↗

Magentic-One

multi-agent orchestrator · microsoft/autogen

Microsoft Research multi-agent: Orchestrator + WebSurfer + FileSurfer + Coder + ComputerTerminal. Lives inside the AutoGen repo. Released Nov 2024.

MIT

Python

GitHub ↗

04 Which One Should I Use?

Pick by the question that matches your situation. The answer column is the framework; the why column is the reason — read both before committing.

"I need it on Slack + Discord + WhatsApp"

→ OpenClaw

50+ messaging integrations are first-class. ClawHub has community channel adapters for almost everything. Hermes is single-agent focused.

"I want it to remember what we did last week"

→ Hermes Agent

FTS5-over-SQLite session search is the headline feature. OpenClaw expects you to bring your own persistence; Hermes ships with deep memory built in.
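
For readers who have not used it, FTS5 is SQLite's built-in full-text search extension, and a session index over it needs very little code. The sketch below uses Python's standard `sqlite3` module and assumes the local SQLite build includes FTS5. The `messages` schema and the function names are hypothetical and are not taken from the Hermes codebase.

```python
import sqlite3
from typing import List, Tuple


def open_session_index(path: str = ":memory:") -> sqlite3.Connection:
    """Create a full-text index over session messages (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS messages "
        "USING fts5(session_id UNINDEXED, role UNINDEXED, content)"
    )
    return conn


def log_message(conn: sqlite3.Connection, session_id: str, role: str, content: str) -> None:
    conn.execute(
        "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
        (session_id, role, content),
    )
    conn.commit()


def search_sessions(conn: sqlite3.Connection, query: str, limit: int = 5) -> List[Tuple[str, str]]:
    """Return (session_id, highlighted snippet) pairs, best match first.

    `query` uses FTS5 query syntax, so raw user input with quotes or
    operators may need escaping before it is passed in.
    """
    rows = conn.execute(
        "SELECT session_id, snippet(messages, 2, '[', ']', '...', 8) "
        "FROM messages WHERE messages MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


if __name__ == "__main__":
    conn = open_session_index()
    log_message(conn, "2026-05-18", "user", "Migrated the billing database to Postgres 16")
    log_message(conn, "2026-05-20", "assistant", "Rotated the staging API keys")
    for session_id, excerpt in search_sessions(conn, "postgres"):
        print(session_id, excerpt)
```

Marking `session_id` and `role` as `UNINDEXED` keeps them out of the text index while still returning them with each hit, and `ORDER BY rank` sorts by FTS5's built-in relevance score.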

"I'm running it inside a regulated environment"

→ Hermes Agent

Container hardening, read-only roots, and pre-execution scanning matter most in regulated environments. Verify current advisories, registry trust, and sandbox configuration before production use.

"I want raw coding speed, like Claude Code but open"

→ Claw Code

Rust runtime under a Python orchestration layer. Built specifically as a clean-room rewrite of the Claude Code harness — same shape, fully transparent.

"I'm fine-tuning small models for tool use"

→ Hermes Agent

Atropos RL integration generates thousands of parallel tool-calling trajectories — research-grade pipeline you'd otherwise build from scratch.

"I want the largest ecosystem and community"

→ OpenClaw

Roughly 15× the GitHub stars of Hermes (~345k vs ~22k). More plugins, more skill authors, more Stack Overflow answers when you hit a wall.

"I just want to wrap my Claude Code subscription"

→ Enderfga/openclaw-claude-code or 13rac1's plugin

Both are thin OpenClaw plugins specifically for Claude Code — the first adds multi-engine routing and a council workflow; the second sandboxes Claude Code in Podman/Docker.

"I want to migrate from OpenClaw to Hermes"

→ Use the official migration tool

A dedicated converter exists for moving OpenClaw configs (agents, skills, bindings) into Hermes profile format. Don't rewrite by hand.

"I want a terminal-native AI pair programmer"

→ Aider

Git-native edits, auto-commits each change with a descriptive message. No IDE lock-in. Most popular terminal pair-programming tool.

"I want VS Code with an autonomous agent inside"

→ Cline (cautious) or Roo Code (multi-mode)

Cline gates every step with human approval — safer for new users. Roo Code adds Architect/Code/Debug personas for switching mindset mid-session.

"I want a Cursor-like editor but fully open source"

→ Void or PearAI

Both are open VS Code forks with batteries included. Void leans pure-OSS BYO-model; PearAI ships more pre-configured.

"I want Devin's autonomous loop but open"

→ OpenHands

~71k stars, by far the largest OSS Devin-style platform. Sandboxed execution, browses the web, runs shells. SDK + CLI + GUI all included.

"I want to auto-fix GitHub issues"

→ SWE-agent

Princeton's reference for SWE-bench. Specifically built around the issue-to-patch loop with a research-grade Agent-Computer Interface.

"I want to chain LLMs + tools + retrieval generally"

→ LangChain (broad) or LangGraph (control)

LangChain is the default toolbox — biggest integration surface. LangGraph trades convenience for explicit graph control and durable execution.

"I want type-safe agents with structured outputs"

→ Pydantic AI

FastAPI ergonomics for agents. Pydantic-validated I/O end-to-end, dependency injection, model-agnostic. Production-grade typing.

"I want role-based teams of agents"

→ CrewAI or Agency Swarm

CrewAI is independent of LangChain with role/goal/process abstractions popular for business workflows. Agency Swarm sits on the OpenAI Agents SDK if you're committed to that stack.

"I want agents whose memory persists for years"

→ Letta (formerly MemGPT)

Memory is the product. Agents are addressable services with durable state and self-editing context, not ephemeral chains.

"I want agents that read documents (RAG-heavy)"

→ LlamaIndex

Strongest story for OCR / document-heavy RAG. Workflows API gives event-driven orchestration over indexed data.

"I want agents that write Python code instead of JSON"

→ smolagents (HuggingFace)

Code-action model: agents emit Python instead of JSON tool calls. Tiny core (~1k LOC), Hub-native tool sharing.

"I want to stop hand-tuning prompts"

→ DSPy (Stanford)

Compiler approach: write modules, run optimizers (MIPRO, BootstrapFewShot), let the framework auto-tune prompts and weights.

"I need an agent for customer-facing chat with guardrails"

→ Parlant

Behavioral "guidelines" as first-class objects, not prompt magic. Built specifically for regulated/customer-support deployments.

"I want to drive a browser with AI"

→ browser-use or Stagehand

browser-use (Python, vision+DOM hybrid) is the fastest-growing browser-agent repo. Stagehand (TypeScript, Playwright-augmented) keeps a deterministic Playwright fallback for production.

"I need self-healing browser workflows (auth, 2FA, CAPTCHAs)"

→ Skyvern

YC-backed, vision-first, specifically engineered to handle the messy stuff that breaks naive Playwright scripts.

"I want an AI agent with its own sandboxed desktop"

→ Bytebot

Full virtual Linux desktop in Docker. The agent has its own OS, files, browser — none of yours. Best for untrusted task execution.

"I need to control Windows VMs with an agent"

→ Pig.dev

Most computer-use frameworks target Linux or the browser. Pig.dev is the rare Windows-first option for legacy enterprise apps.

"I want the official OpenAI primitives"

→ OpenAI Agents SDK

Production successor to Swarm. Handoffs, guardrails, tracing — the base layer many other libs now extend.

"I want AWS-backed multi-agent in Python"

→ Strands Agents

AWS open-sourced what they use internally for Bedrock services. v1.0 added A2A protocol; ~14M downloads by early 2026.

05 Sources

Every claim in the comparison and primer above is sourced from one of these. If a number drifts, file an issue and we'll re-check.

Primary Sources

The New Stack — "OpenClaw vs Hermes Agent: The race to build AI assistants that never forget" · primary source for the side-by-side comparison and historical timeline
NousResearch/hermes-agent (GitHub) · canonical Hermes repo, README, release notes
instructkr/claw-code (GitHub) · Claw Code main repo, Python+Rust clean-room rewrite
claw-code.codes · official Claw Code project page
Enderfga/openclaw-claude-code (GitHub) · example plugin showing the shape of an OpenClaw plugin
utilo.io — "Hermes Agent vs Claude Code vs OpenClaw (2026)" · third-party comparison
MCPlato — "AI Agent Harness Deep Dive 2026" · cross-framework breakdown

Coding-Agent GitHub Repos

Aider-AI/aider · continuedev/continue · cline/cline · RooCodeInc/Roo-Code
All-Hands-AI/OpenHands · SWE-agent/SWE-agent · stackblitz/bolt.new
AntonOsika/gpt-engineer · smol-ai/developer · plandex-ai/plandex · trypear/pearai-app
block/goose · charmbracelet/crush · voideditor/void · vercel/v0-sdk

Orchestration GitHub Repos

langchain-ai/langchain · langchain-ai/langgraph · microsoft/autogen · crewAIInc/crewAI
huggingface/smolagents · pydantic/pydantic-ai · run-llama/llama_index · deepset-ai/haystack
agno-agi/agno · letta-ai/letta · VRSEN/agency-swarm · FoundationAgents/MetaGPT
OpenBMB/ChatDev · camel-ai/camel · strands-agents/sdk-python · stanfordnlp/dspy
PrefectHQ/marvin · BrainBlend-AI/atomic-agents · emcie-co/parlant · openai/openai-agents-python · kyegomez/swarms

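Browser / Computer-Use GitHub Repos

browser-use/browser-use · Skyvern-AI/skyvern · lavague-ai/LaVague · browserbase/stagehand
OpenInterpreter/open-interpreter · OthersideAI/self-operating-computer · simular-ai/Agent-S
bytebot-ai/bytebot · pig-dot-dev/piglet · microsoft/autogen (Magentic-One)
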
Catalog last refreshed: May 25, 2026. Star counts and version numbers move quickly in this space; the GitHub link in each card is authoritative. Cards without verified star counts are listed based on web research — click through to the repo for current state. Spotted an error or missing framework? File an issue and we'll re-verify.