Hugging Face

Smolagents

Name: Smolagents
Author: Hugging Face

Minimal code-writing agents from Hugging Face — the smallest agent framework that works.

Minimal code-writing agents

Visit Site View on GitHub Read the Docs

GitHub Stars

27.4K

Contributors

203

npm / Week

—

PyPI / Month

582.8K

Maintained by: Hugging Face
First released: Dec 2024
Last commit: 3 days ago
Language: Python
License: apache-2

Overview

Smolagents is a minimal agent runtime from Hugging Face that deliberately does less than its peers. Released in 2024 under the Apache 2.0 license, it’s a Python library built around a core insight: agents that write Python code as their action format are more expressive than agents constrained to JSON tool calls. The framework’s tagline — “the smallest agent framework that works” — isn’t marketing hype. The core agent logic fits in roughly 1,000 lines of documented code, which you can read end to end in an afternoon.

This design philosophy comes directly from the original ReAct paper. Where most frameworks layer abstractions on top of abstractions (chains, graphs, state machines, middleware), Smolagents keeps the abstraction surface flat. You get an Agent class, a Tool class, and a model interface. That’s it. The result is a framework that’s ideal for teams who want to understand exactly what their agent is doing, modify behavior without fighting framework internals, and run agents that can handle arbitrary Python logic.

The numbers back up its traction: 27,349 GitHub stars, 203 contributors, and 582,776 monthly PyPI downloads at the time of writing. That’s a strong signal for a framework that’s still under two years old. It’s maintained by the same Hugging Face team that delivers transformers, datasets, and the Hub — a team that knows how to build and maintain open source at scale.

Smolagents competes most directly with LangChain’s code-execution agents and CrewAI’s agent workflows. But its philosophy is closer to the simplicity of Pydantic AI or the original Autogen concept: a thin layer over the model that gives the model agency to act, rather than a sprawling middleware platform.

Architecture and Programming Model

Smolagents is imperative and code-first. You don’t define a graph, a chain, or a configuration file. You write Python.

At the core are three abstractions:

`CodeAgent`: the primary agent class. It maintains a conversation loop: the model receives messages (system prompt, tool definitions, user request, previous outputs), generates a plan, and writes Python code as its action. The agent executes that code in a sandbox, captures stdout/stderr, and feeds results back to the model. This loop continues until the user’s goal is met or a stop condition triggers.

Tool: a callable object with a name, description, inputs, and outputs. Tools are functions wrapped with metadata that the agent can discover and call via generated Python. You can define tools as simple decorated functions or import pre-built tools from the Hugging Face Hub.

Model: a wrapper around any LLM. Smolagents supports local transformers models, Hugging Face inference endpoints, OpenAI, Anthropic, and any provider via LiteLLM. The model handles tokenization, generation, and parsing of the agent’s action (code) back into the loop.

The control flow is straightforward:

User provides a task and a set of tools.
Agent sends the initial prompt to the LLM.
LLM generates a plan and writes Python code (using imports for tools, calling them as functions).
CodeAgent executes the code in a sandbox (E2B, Docker, Pyodide, or local).
Execution results are appended to the message history.
Loop repeats until the task is complete.

There is no built-in state machine, no branching, no subgraph composition. If you want multi-step planning with conditional logic, you express it in the Python code the agent writes. This gives you maximum flexibility at the cost of some guardrails.

For multi-agent scenarios, you can create multiple CodeAgent instances, each with different tools or models, and coordinate them via message passing in your own orchestration code. Smolagents provides no built-in multi-agent protocol; that’s a deliberate choice to keep the core small.

Key Features and Capabilities

Code-writing agents. Smolagents’ defining capability is that agents output Python code, not JSON tool calls. This matters because code is far more expressive. An agent can import libraries, loop over collections, conditionalize behavior, chain operations, and handle edge cases inline. For tasks like math reasoning, data analysis, search, or file manipulation, code agents outperform JSON-call agents on many benchmarks. The agent can write from math import sqrt; result = sqrt(42) instead of figuring out how to nest square root inside a JSON tool call structure.

Tiny core. The agent logic lives in agents.py — roughly 1,000 lines with inline comments. This is not just a design talking point; it has real implications. You can debug by reading the source. You can customize behavior by forking the file. You can understand the prompt templates, the action parsing, and the loop invariants in an afternoon. For teams evaluating frameworks, this transparency reduces risk.

Sandboxed execution. Code execution is inherently dangerous when the agent runs arbitrary Python. Smolagents provides multiple sandbox backends: E2B, Docker, Blaxel, Modal, and a Pyodide/Deno WebAssembly sandbox. You can choose the level of isolation that fits your risk profile. For trusted inputs (e.g., internal research assistants), local execution may suffice. For production systems with user-provided prompts, use a sandbox backend like E2B that runs each agent session in a fresh environment.

Hub integration. Tools and agents can be pushed to or pulled from the Hugging Face Hub. This enables sharing of curated tools across teams or the open source community. It also means you can start building with a tool library that already includes search, web scraping, file I/O, and more.

Model-agnostic. Smolagents works with any LLM that can be called via a Python API. It ships with support for transformers (local), Hugging Face inference providers, OpenAI, Anthropic, and LiteLLM. You are not locked into Hugging Face’s model ecosystem, though it integrates naturally if you are already using transformers.

Multi-agent capability. While not a first-class abstraction, you can compose multiple agents by running them in parallel or sequence from your own code. This is enough for most multi-agent patterns (research and summarization, iterative refinement, tool dispatching) without the overhead of a dedicated orchestration layer.

Real-World Use Cases

Research agents. This is where Smolagents shines. A researcher wants an agent that can fetch documents from the web, process them with pandas or nltk, compute statistics, and produce a summary. A code agent can import those libraries directly, chain operations, and handle errors gracefully — all in one Python script generated automatically. Frameworks that only support discrete JSON tool calls would require splitting that workflow into many individual tool invocations, which is slower and more brittle.

Teaching and exploration. Smolagents is frequently used in workshops and courses to teach agent fundamentals. Because the core code is small and readable, students can trace how the model’s output becomes executed code. It’s a live demonstration of the ReAct pattern. This use case drives much of the framework’s popularity on GitHub.

Hugging Face stack integration. If you already deploy models via transformers or use the Hub for model hosting, Smolagents slots in with zero additional infrastructure. You can run a CodeAgent with a local Mistral or Llama model, using the same tokenizer and generation configuration you already have.

Internal copilots and coding assistants. Teams building internal tools that need to automate data processing, report generation, or system administration find code agents effective. The agent can call shell commands, manipulate files, query databases, and produce output — all in one run.

Poor fit: Smolagents is not a platform for large-scale customer-facing chatbots that require high availability, observability, and load balancing. The framework intentionally lacks production tooling: no built-in tracing, no queue management, no rate limiting. You bring your own observability. If your primary need is a robust conversational system with fallback flows and latency SLAs, look elsewhere.

Getting Started With Smolagents

Install:

1pip install smolagents

The smallest meaningful example (20 lines or less):

1from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel
2
3agent = CodeAgent(
4    tools=[DuckDuckGoSearchTool()],
5    model=HfApiModel()  # uses your HF token or default inference endpoint
6)
7
8agent.run("What is the current population of Tokyo, and how does it compare to 10 years ago?")

This agent will: receive the question, generate a plan, write Python code that calls the search tool, parse the results, compute the answer, and return it. No configuration files, no YAML, no extra setup.

What you need around it:

An LLM provider. The simplest path is a Hugging Face Inference API token (free tier available). Alternatively, set up a local model with transformers or use OpenAI/Anthropic via LiteLLM.
For production with untrusted inputs, set up a sandbox backend (E2B, Docker). The E2B sandbox requires an API key but removes the risk of arbitrary code execution.
Observability: add your own logging, tracing (OpenTelemetry), or error handling. Smolagents does not ship with a dashboard or monitoring.

Docs and community: The official documentation is at [huggingface.co/docs/smolagents](https://huggingface.co/docs/smolagents). The GitHub repository at huggingface/smolagents has extensive examples under examples/ and a dedicated AGENTS.md guide. The Hugging Face Discord and forum are active for troubleshooting.

How It Compares

Smolagents vs. LangChain. LangChain is a battle-tested framework with rich production tooling: model providers, memory, callbacks, tracing with LangSmith, and a large ecosystem of integrations. However, it is also heavy. The abstraction layers (chains, agents, tools, retrievers, callbacks) require significant learning. Smolagents is the antipode: minimal abstraction, maximum readability. Choose Smolagents when you want to understand what your agent is doing and you value simplicity over ecosystem breadth. Choose LangChain when you need out-of-the-box observability, versioned chains, or enterprise integrations.

Smolagents vs. CrewAI. CrewAI focuses on multi-agent orchestration with roles, processes, and task delegation. It is good for scenarios where different agents have distinct personas and you want hierarchical workflows. Smolagents does not provide role abstractions. If you need a “manager agent” that delegates to “worker agents” with explicit task lists, CrewAI is a better choice. If you simply want a single powerful code agent that can call multiple tools and reason step by step, Smolagents is cleaner and faster to set up.

Smolagents vs. Pydantic AI. Pydantic AI shares Smolagents’ philosophy of minimalism and Python-first design. Both produce agents that are easy to reason about and customize. The main difference is execution format: Pydantic AI uses structured tool calls (JSON), while Smolagents uses generated Python. For tasks where arbitrary computation is required, Smolagents’ code agents are more capable. For tasks that must strictly adhere to a predefined API surface, Pydantic AI’s type-safety is an advantage.

The tradeoff is clear: Smolagents trades production polish for transparency and power. It is a framework you build on top of, not inside of. If your team values understanding every line of the runtime and wants an agent that can write arbitrary Python, it is the right choice. If you need a turnkey production platform, you will need to supplement it with your own infrastructure.

Strengths

Small enough to read end-to-end — under 1,000 lines of core code.
Code agents are more capable than JSON tool-call agents on many tasks.
Backed by the Hugging Face transformers ecosystem.

Trade-offs

Less production tooling than the alternatives — bring your own observability.
Code execution needs sandboxing for untrusted inputs.

Key Features

What the framework gives you out of the box, in plain language.

Multi-Agent
Streaming
Tool Use
Human in the Loop
Memory
Tracing
Evaluations
Self-Hostable
Cloud-Hosted
Type-Safe

Code-writing agents
Agents produce Python code as their action — more expressive than JSON tool calls.
Tiny core
A few hundred lines of well-commented code you can actually read and modify.
Sandboxed execution
Multiple sandbox backends, including E2B and Docker, for safe code execution.

Where It Shines

The jobs this framework is best suited for.

Research agents
Agents that can chain Python operations naturally — math, search, file manipulation.
Teaching and exploration
A readable codebase that's ideal for learning how agent frameworks work.
Hugging Face stack
Native fit when you're already running open-weight models via transformers.

Side-by-Side

Compare Smolagents With Another Framework

Add a second or third framework and see stars, downloads, and capabilities lined up next to each other.

Open the Comparator

Related Frameworks

Close alternatives worth a look before you decide.

LangChain

Composable building blocks for LLM apps — chains, agents, retrievers, and integrations.

Composable LLM building blocks

Stars

137.0K

npm / wk

2.2M

PyPI / mo

241.8M

MixedMITLast commit:Today

CrewAI

Multi-agent crews with role-based prompts and explicit task hand-offs.

Role-based multi-agent crews

Stars

51.6K

npm / wk

—

PyPI / mo

9.6M

PythonMITLast commit:2 days ago

Pydantic AI

Type-safe agents with structured outputs from the Pydantic team.

Type-safe Python agents

Stars

17.1K

npm / wk

—

PyPI / mo

39.1M

PythonMITLast commit:2 days ago

Frequently Asked Questions

What Is an Agent Framework?

An agent framework is the code your team uses to wire large language models into tools, memory, and human checkpoints. It is the connective tissue between an LLM call and a real task, like answering a support ticket or running a multi-step research workflow.

Is Smolagents open source?

Smolagents ships under the apache-2 license. The source code lives on GitHub, so you can read it, fork it, and run it on your own infrastructure if your team prefers self-hosting.

Which language is Smolagents built in?

Smolagents is primarily a Python project. Pick a framework that matches the language your team already ships in. The cost of a stack switch is almost always higher than the difference between two frameworks.

Need Help Adopting Smolagents?

We help teams stand up production agents with the right framework for their stack, on a money-back basis if we cannot show ROI.

Hugging Face

Smolagents

Minimal code-writing agents from Hugging Face — the smallest agent framework that works.

Minimal code-writing agents

Visit Site View on GitHub Read the Docs

GitHub Stars

27.4K

Contributors

203

npm / Week

—

PyPI / Month

582.8K

Maintained by: Hugging Face
First released: Dec 2024
Last commit: 3 days ago
Language: Python
License: apache-2

Overview

Architecture and Programming Model

Smolagents is imperative and code-first. You don’t define a graph, a chain, or a configuration file. You write Python.

At the core are three abstractions:

`CodeAgent`: the primary agent class. It maintains a conversation loop: the model receives messages (system prompt, tool definitions, user request, previous outputs), generates a plan, and writes Python code as its action. The agent executes that code in a sandbox, captures stdout/stderr, and feeds results back to the model. This loop continues until the user’s goal is met or a stop condition triggers.

Tool: a callable object with a name, description, inputs, and outputs. Tools are functions wrapped with metadata that the agent can discover and call via generated Python. You can define tools as simple decorated functions or import pre-built tools from the Hugging Face Hub.

Model: a wrapper around any LLM. Smolagents supports local transformers models, Hugging Face inference endpoints, OpenAI, Anthropic, and any provider via LiteLLM. The model handles tokenization, generation, and parsing of the agent’s action (code) back into the loop.

The control flow is straightforward:

User provides a task and a set of tools.
Agent sends the initial prompt to the LLM.
LLM generates a plan and writes Python code (using imports for tools, calling them as functions).
CodeAgent executes the code in a sandbox (E2B, Docker, Pyodide, or local).
Execution results are appended to the message history.
Loop repeats until the task is complete.

Key Features and Capabilities

Real-World Use Cases

Getting Started With Smolagents

Install:

1pip install smolagents

The smallest meaningful example (20 lines or less):

1from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel
2
3agent = CodeAgent(
4    tools=[DuckDuckGoSearchTool()],
5    model=HfApiModel()  # uses your HF token or default inference endpoint
6)
7
8agent.run("What is the current population of Tokyo, and how does it compare to 10 years ago?")

What you need around it:

An LLM provider. The simplest path is a Hugging Face Inference API token (free tier available). Alternatively, set up a local model with transformers or use OpenAI/Anthropic via LiteLLM.
For production with untrusted inputs, set up a sandbox backend (E2B, Docker). The E2B sandbox requires an API key but removes the risk of arbitrary code execution.
Observability: add your own logging, tracing (OpenTelemetry), or error handling. Smolagents does not ship with a dashboard or monitoring.

How It Compares

Strengths

Small enough to read end-to-end — under 1,000 lines of core code.
Code agents are more capable than JSON tool-call agents on many tasks.
Backed by the Hugging Face transformers ecosystem.

Trade-offs

Less production tooling than the alternatives — bring your own observability.
Code execution needs sandboxing for untrusted inputs.

Key Features

What the framework gives you out of the box, in plain language.

Multi-Agent
Streaming
Tool Use
Human in the Loop
Memory
Tracing
Evaluations
Self-Hostable
Cloud-Hosted
Type-Safe

Code-writing agents
Agents produce Python code as their action — more expressive than JSON tool calls.
Tiny core
A few hundred lines of well-commented code you can actually read and modify.
Sandboxed execution
Multiple sandbox backends, including E2B and Docker, for safe code execution.

Where It Shines

The jobs this framework is best suited for.

Research agents
Agents that can chain Python operations naturally — math, search, file manipulation.
Teaching and exploration
A readable codebase that's ideal for learning how agent frameworks work.
Hugging Face stack
Native fit when you're already running open-weight models via transformers.

Side-by-Side

Compare Smolagents With Another Framework

Add a second or third framework and see stars, downloads, and capabilities lined up next to each other.

Open the Comparator

Related Frameworks

Close alternatives worth a look before you decide.

LangChain

Composable building blocks for LLM apps — chains, agents, retrievers, and integrations.

Composable LLM building blocks

Stars

137.0K

npm / wk

2.2M

PyPI / mo

241.8M

MixedMITLast commit:Today

CrewAI

Multi-agent crews with role-based prompts and explicit task hand-offs.

Role-based multi-agent crews

Stars

51.6K

npm / wk

—

PyPI / mo

9.6M

PythonMITLast commit:2 days ago

Pydantic AI

Type-safe agents with structured outputs from the Pydantic team.

Type-safe Python agents

Stars

17.1K

npm / wk

—

PyPI / mo

39.1M

PythonMITLast commit:2 days ago

Frequently Asked Questions

What Is an Agent Framework?

Is Smolagents open source?

Smolagents ships under the apache-2 license. The source code lives on GitHub, so you can read it, fork it, and run it on your own infrastructure if your team prefers self-hosting.

Which language is Smolagents built in?

Need Help Adopting Smolagents?

We help teams stand up production agents with the right framework for their stack, on a money-back basis if we cannot show ROI.