Can sub-agents spawn other sub-agents in Claude Code?

Yes, as of v2.1.172 (June 10, 2026). Sub-agents can now spawn their own sub-agents up to 5 levels deep. Before this version the Agent tool was disabled in sub-agent contexts to prevent infinite recursion, which is why older docs and tutorials still claim nesting is impossible.

How many sub-agents can Claude Code run at once?

There is no hard concurrency cap on the foreground tree, but each parallel branch carries its own context window and burns tokens independently. The official cost guide cites a 7× token multiplier for agent teams running in plan mode; nested sub-agents trend in the same direction whenever the tree fans out, because each branch maintains its own context window. Run more than 3-4 parallel branches and you'll feel it in the bill before you feel it in latency.

What model should I use for nested sub-agents?

Default the leaf nodes to Haiku via `model: haiku` in the sub-agent frontmatter. Reserve Sonnet for branches that need code modification, and Opus only for the top-level reasoner. The further from the root, the cheaper the model — that's the inverse of how teams usually configure them.

How do I prevent a runaway nested sub-agent in Claude Code?

Three guardrails: set `maxTurns` on every sub-agent definition (typical: 8-15), restrict `tools` to the minimum required (omit `Agent` entirely on leaf sub-agents so they can't nest further), and use `permissions.deny: ["Agent(name)"]` in `settings.json` to block specific child types globally. The `Agent(name1, name2)` allowlist syntax in `tools` only applies when the agent runs as the main thread via `claude --agent` — in nested sub-agent definitions any type list inside the parentheses is ignored.

Do nested sub-agents share context with the parent?

No. Each sub-agent runs in its own context window with a custom system prompt and receives only the task description from the parent. It returns a summary back to the parent. That isolation is the point — it's also why nesting multiplies token usage instead of sharing it.

Can a nested sub-agent use MCP servers the parent doesn't have?

Yes. The `mcpServers` field in a sub-agent's frontmatter can declare inline server definitions that connect only while that sub-agent runs. Useful when you want a Playwright or Postgres server available to a leaf-level sub-agent without bloating the main session's tool list.

What's the difference between nested sub-agents and agent teams in Claude Code?

Sub-agents (nested or not) run inside a single Claude Code session, share its top-level user prompt, and return results back to the parent. Agent teams are separate sessions that communicate with each other and survive past a single conversation. Nested sub-agents are the right call for divide-and-conquer inside one task; agent teams are for long-running collaboration across tasks.

Does ofox.ai work with Claude Code sub-agents?

Yes. Sub-agents inherit the session's `ANTHROPIC_BASE_URL`, so pointing the main process at `https://api.ofox.ai/anthropic` routes every nested sub-agent through the same gateway. You can also set `CLAUDE_CODE_SUBAGENT_MODEL` to a specific ofox model ID like `anthropic/claude-haiku-4.5` to override all sub-agent model selections at once.

Claude Code Nested Sub-Agents: 5 Levels Deep, Token Math, 3 Pitfalls (2026)

For most of 2026, “subagents cannot spawn other subagents” was a hard rule in the Claude Code docs — a deliberate guardrail against infinite recursion. That rule was lifted on June 10, when v2.1.172 shipped one line of changelog: sub-agents can now nest, up to 5 levels deep. The docs have since added a “Spawn nested subagents” section, but most existing tutorials and templates predate it.

If you’re already comfortable with single-level delegation, the upgrade looks small — but the token math, the configuration surface, and the failure modes are all different from what your habits expect. The most easily-missed gotcha: the Agent(name1, name2) allowlist syntax only restricts spawning when an agent runs as the main thread (claude --agent); inside a nested sub-agent definition the parenthesised list is silently ignored. This guide is what you needed to read before the first nested run you regret.

What You Can Do With Nested Sub-Agents (And What You Can’t)

You Can	You Can’t
Spawn sub-agents from within a sub-agent, up to 5 levels deep	Skip configuring `tools`, `model`, or `maxTurns` — the defaults inherit everything from the parent
Route each level to a different model (`opus` → `sonnet` → `haiku`)	Share context between nested sub-agents — each runs in an isolated context window
Prevent further nesting by omitting `Agent` from a sub-agent’s `tools` list	Use `Agent(name1, name2)` inside a sub-agent definition as an allowlist — the parens are ignored unless the agent runs as the main thread via `claude --agent`
Define inline `mcpServers` and `hooks` that only run during one sub-agent’s lifetime	Use `EnterPlanMode`, `AskUserQuestion`, `ScheduleWakeup`, or `WaitForMcpServers` inside any sub-agent (those tools require main-thread state)
Set `permissions.deny: ["Agent(name)"]` in `settings.json` to block specific sub-agent types globally	Spawn further sub-agents from a background sub-agent at depth five — that level loses the Agent tool by design

The 30-second mental model: nested sub-agents are recursion with a 5-frame stack limit, each frame carrying its own system prompt and model. The parent reads only the leaf’s summary. Everything in between costs tokens and disappears.

Decision Frame: When to Nest (and When NOT)

When to use nested sub-agents

The work is a tree, not a list — a top-level reviewer that needs to ask three specialised sub-reviewers (security, perf, style) the same question against the same diff, where each specialist has its own MCP server or skill preload.
The work is recursive in shape — exploring a monorepo where each packages/* directory needs its own scoped exploration with its own conventions, and the leaf summaries roll up.
You’re hitting context-window pressure at level 1 — single-level sub-agents are returning 30K-token reports that swamp the main conversation, and pushing one more layer down would let each leaf summarise before sending up.

When NOT to use nested sub-agents

The work is a sequence — five steps that depend on each other belong in a single sub-agent with maxTurns: 20, not five nested calls. Each layer adds an irreversible context split and a system-prompt write.
You want parallel workers that talk to each other — that’s agent teams, not nesting. Nested sub-agents communicate only up and down, never sideways.
The leaf would only call one tool — at that point you’re paying a full sub-agent system prompt for one Read or Grep. Just call the tool from the parent.

Stop rule

Before nesting, write the leaf summary you expect on a sticky note. If it’s under ~500 tokens or the parent could have produced it directly with two tool calls, don’t nest. Nesting earns its keep when the leaf does meaningful work and returns a meaningfully compressed answer.

System Requirements

Requirement	Verify with
Claude Code v2.1.172 or newer	`claude --version`
`Agent` tool present in the parent sub-agent’s `tools` (allowlist)	`cat .claude/agents/<parent>.md \| grep -i agent`
Anthropic API access OR an Anthropic-compatible gateway like ofox.ai	`echo $ANTHROPIC_BASE_URL`
At least one child sub-agent definition in `.claude/agents/` or `~/.claude/agents/`	`ls -1 .claude/agents/ ~/.claude/agents/ 2>/dev/null`

Upgrade with npm i -g @anthropic-ai/claude-code if you’re on 2.1.171 or below. The CLI prints the resolved version on launch; older versions silently ignore nesting calls and treat the inner Agent tool use as a regular Read.

Step-by-Step: Set Up a Nested Sub-Agent Chain

The example below builds a three-level chain: a top-level triage-lead that classifies an incoming bug report, delegates to a repro-runner to confirm reproducibility, which in turn delegates to a log-summariser to compress raw container logs into a 200-line summary.

Step 1: Define the leaf

Create .claude/agents/log-summariser.md:

---
name: log-summariser
description: Reads raw container logs and returns a structured summary of errors, timing, and suspicious patterns.
tools: Read, Grep
model: haiku
maxTurns: 8
---

You are a log summariser. Given a path to a log file, return:
1. Top 5 distinct error signatures with frequency
2. Earliest and latest timestamps observed
3. Any panic, OOM, or segfault entries with line numbers
Return strictly Markdown — no preamble.

Expected result: the file is picked up immediately if you create it via /agents, or on next session start if you edit it on disk.

Step 2: Define the mid-tier

Create .claude/agents/repro-runner.md:

---
name: repro-runner
description: Reproduces a bug from a description, captures logs, and returns whether the failure is deterministic.
tools: Agent, Read, Bash
model: sonnet
maxTurns: 12
---

You reproduce bugs. Run the failing command, capture logs to /tmp/repro.log,
then call the log-summariser sub-agent to compress the output before returning.
Report: deterministic? yes/no, plus the summary.

The Agent entry in tools is the new piece. Pre-2.1.172, the Agent tool was disabled in sub-agent contexts entirely; listing it had no effect. From 2.1.172 onward, listing Agent lets the sub-agent spawn nested children. Note that Agent(specific-name) inside a sub-agent definition is not a working allowlist — the docs are explicit that the parenthesised type list “is ignored” in this context. To actually restrict which children get spawned, either omit Agent from leaf-tier files (so they cannot nest at all) or use permissions.deny: ["Agent(name)"] in .claude/settings.json to block specific types globally.

Step 3: Define the root

Create .claude/agents/triage-lead.md:

---
name: triage-lead
description: Triages incoming bug reports end-to-end — classification, reproduction, log review.
tools: Agent(repro-runner), Read, Grep, Bash
model: opus
maxTurns: 15
---

Classify the bug (P0/P1/P2), delegate reproduction to repro-runner, then
return a triage memo with severity, repro status, and the summarised logs.

The tools: Agent(repro-runner) allowlist here only takes effect if triage-lead runs as the main thread (see Step 4). When triage-lead is invoked as a regular sub-agent from a normal session, the docs are explicit that the parenthesised list is ignored and the agent can spawn any registered sub-agent type. The chain is: main session → triage-lead (Opus) → repro-runner (Sonnet) → log-summariser (Haiku). Three levels.

Step 4: Invoke from the main session

There are two invocation patterns, and they behave differently:

# Pattern A — triage-lead runs as a SUB-AGENT (allowlists in its file are ignored)
> Use the triage-lead agent on the bug report at ./bugs/2026-06-11-checkout-500.md

# Pattern B — triage-lead runs as the MAIN THREAD (its Agent(...) allowlist is enforced)
claude --agent triage-lead
> Triage ./bugs/2026-06-11-checkout-500.md

Use Pattern B when you want the Agent(repro-runner) allowlist on triage-lead to be enforced. Pattern A is fine when you trust the sub-agent to delegate sensibly and you’d rather keep your normal session open. Each handoff appears as a separate panel below the prompt; the panel rows show each sub-agent with a (+N) descendant count and the path back to main.

Step 5: Verify the nesting actually happened

After the run completes, check /agents (the Running tab) or scroll the transcript for three distinct sub-agent panels. If you only see one, the parent treated the spawn as a regular tool invocation — almost always a version mismatch. Re-run claude --version and confirm 2.1.172+.

/agents → Running tab
> triage-lead         (+2)   opus     ✓ completed
> repro-runner        (+1)   sonnet   ✓ completed
> log-summariser              haiku    ✓ completed

The (+N) next to each row is the descendant count. Three rows with the leaf reporting no descendants is the floor for “this worked.” The 5-level cap is a stack limit, not a target — most useful chains live at depth 2-3.

Common Errors When Nesting Sub-Agents (and Fixes)

Symptom	Likely cause	Quick fix
`Agent` tool absent from sub-agent’s available tools	Pre-2.1.172 behavior, or `Agent` omitted from the parent’s `tools` field	Upgrade Claude Code, then add `Agent` (no parentheses needed inside a sub-agent definition) to the parent’s `tools`
Sub-agent spawns child but child has no MCP servers the parent had	MCP servers are not inherited through nesting — each sub-agent reconnects per its own frontmatter	Add the server name to the child’s `mcpServers` field, or hoist to user-level `settings.json`
`Agent(name)` allowlist in a sub-agent file not enforcing	Documented behaviour: the parens are ignored inside any sub-agent definition; the allowlist only works when the agent runs as the main thread via `claude --agent`	For nested calls, move the restriction to `permissions.deny: ["Agent(name)"]` in `.claude/settings.json`, or omit `Agent` from the sub-agent’s `tools` to block nesting entirely
`permissionMode` / `hooks` / `mcpServers` silently stripped from a sub-agent	Agent file was loaded from a plugin’s `agents/` directory; plugin security policy removes these fields	Copy the agent file into `.claude/agents/` or `~/.claude/agents/` instead of leaving it in the plugin
Nested sub-agent reaches depth 6 and stops with no clear error	5-level recursion cap reached; the 6th-level call fails silently in some transcripts	Refactor: either flatten one level or split the work across two top-level invocations
Parent reports “completed” but child agent stays in “active” state	Bug fixed in 2.1.172 for the most common variant; can still appear with background sub-agents	Update to 2.1.172+, or kill the child via `/agents` Running tab and re-run
Token bill spikes 7-12× without a visible behaviour change	Each nested level adds a system-prompt write; default model inherits the main session’s (Opus for many users)	Set `CLAUDE_CODE_SUBAGENT_MODEL=haiku` to force Haiku on every sub-agent that doesn’t override, then promote per-agent

The “parent reports completed / child stays active” row is the one that bites teams resuming older sessions. The v2.1.172 changelog explicitly notes the fix for background sub-agents stuck as active after a nested child was stopped — if you’re resuming a session that was started on an older version, start fresh once to clear any inherited state.

Token Math: Why Nesting Costs Add Up Per Branch

This is the calculation that makes or breaks the technique. A single-thread Claude Code session reuses one cached system prompt across turns. A sub-agent runs in its own context window with its own system prompt (smaller than the main session’s, but not free). Nest that, and you pay the same overhead at each depth.

Depth	What gets written	What’s reusable	Approx. token overhead per call
0 (main)	Main system prompt + tools + CLAUDE.md	Yes (cached across turns)	Baseline
1 (sub-agent)	Sub-agent system prompt + tool list + task description	Partial — tool list cached within branch	+30-60% of baseline
2 (nested)	Inner sub-agent prompt + tool list + task description from parent’s summary	Lower — each child writes a fresh prefix	+30-60% on top of depth 1
3-5 (deep nesting)	Same shape, cumulative	Vanishing — each leaf’s prefix is short-lived	Cumulative several× single-thread

The official cost-management guide cites a ~7× token multiplier for agent teams running in plan mode — separate Claude Code instances communicating with each other. Nested sub-agents are not the same primitive, but they trend in a similar direction whenever the tree is wide, because each branch maintains its own context window and writes a fresh sub-agent prompt prefix. Leaf calls usually do less work and use cheaper models, which keeps the multiplier finite — but every fresh fan-out at every depth contributes.

Two levers fight that math:

Model tiering by depth. The leaf rarely needs Opus. A typical breakdown for a triage chain: Opus at root, Sonnet at level 1, Haiku from level 2 onward. Pricing tiers between Opus and Haiku run roughly an order of magnitude apart on input tokens, so the savings on the leaf tier compound across every nested call.
maxTurns per layer. Without it, a leaf can churn for 30+ turns trying to “be helpful.” Cap leaves at 8 turns and mid-tiers at 12; mostly you’ll never hit the cap, and when you do it’s a signal that the task was wrong.

For the deeper cost model on Claude Code session economics, see the Claude Code token optimization guide. Its cache_read_input_tokens analysis applies per sub-agent branch, not just to the main thread.

3 Anti-Patterns That Blow Your Budget

Anti-Pattern 1: Runaway nesting via general-purpose recursion

The path of least resistance is to give every sub-agent tools: Agent and let the model figure out who to spawn. Within hours you’ll see a transcript where the triage agent spawned a “general researcher” that spawned an “investigation specialist” that spawned a “log reader” that spawned… a general researcher again. Each layer adds tokens. The depth cap of 5 saves you from infinite loops but does nothing to save you from the bill.

Fix: Inside sub-agent definitions, the Agent(name1, name2) parenthesised allowlist is silently ignored, so you can’t gate spawning at the file level the way you can with a main-thread agent. Two mechanisms actually work:

Omit Agent from leaf-tier files. A leaf-tier tools: Read, Grep (no Agent) physically cannot nest further. Use this as the default and only re-add Agent on mid-tier files that genuinely need to delegate.
Block specific children globally with permissions.deny: ["Agent(general-purpose)"] in .claude/settings.json. This enforces the restriction across every sub-agent in the project regardless of what the individual frontmatter says.

Anti-Pattern 2: Opus everywhere

Defaulting every sub-agent to the main conversation’s model is the default, and it’s exactly wrong for a multi-level tree. Opus at the root is justified (you’re doing the synthesis). Opus at a leaf that’s reading a log file and counting error frequencies is paying the top-tier rate for work that Haiku handles at a fraction of the cost.

Fix: Set CLAUDE_CODE_SUBAGENT_MODEL=haiku at the shell level so every sub-agent that doesn’t override gets Haiku. Then promote only the ones that need more — typically just root and depth-1. The Claude Code hybrid routing pattern walks through how to think about per-task model selection.

Anti-Pattern 3: Treating `Agent(name)` in sub-agent files as a real allowlist

Project sub-agent files are checked into version control, and it’s tempting to write tools: Agent(repro-runner, log-summariser), Read, Grep, Bash to “lock down” which children that sub-agent can spawn. The docs are explicit that this does not work: inside a sub-agent definition, “any type list inside the parentheses is ignored.” The file reads as if it’s restrictive, but at runtime the sub-agent can still spawn any registered type. The allowlist only enforces when the agent runs as the main thread via claude --agent.

Fix: For real per-project restrictions, encode them in .claude/settings.json:

{
  "permissions": {
    "deny": ["Agent(general-purpose)", "Agent(migration-runner)"]
  }
}

That denylist is honoured uniformly. In the agent file itself, either leave Agent off the tools line (no nesting) or list it without parens (nesting allowed, governed by global settings). Treat any change to permissions.deny as a security review, the same way you’d treat changing permissions.allow. The Claude Code safety guide covers the broader principle: every permission-adjacent field is a security boundary.

Team / Multi-Developer Configuration

For teams running shared Claude Code setups across many repositories, three conventions keep nested sub-agents cheap, predictable, and reviewable:

Convention	Where to set it	Why it matters
`CLAUDE_CODE_SUBAGENT_MODEL=haiku` in shared shell rc	Per-developer dotfiles	Eliminates the “Opus everywhere” leak by default; teams that need Sonnet/Opus override per file
Leaf-tier sub-agents committed under `.claude/agents/` with `Agent` omitted from `tools`	Per-repo	Prevents leaf-tier files from spawning further sub-agents at all; code review surfaces any addition of `Agent` to a leaf file
`permissions.deny: ["Agent(general-purpose)"]` for sub-agent-heavy projects	`.claude/settings.json`	Forces specialised delegation; prevents the recursion anti-pattern at the only layer where allowlist enforcement actually works for nested calls

For organisations standardising across many repos, also consider a dotfiles-managed ~/.claude/agents/ directory containing the cross-project agents (review, security, doc-summariser). User-level definitions win against project-level only when project doesn’t redeclare the same name, so the convention is “shared in user scope, repo-specific in project scope.”

Advanced: Routing Different Levels to Different Models via ofox

The CLAUDE_CODE_SUBAGENT_MODEL environment variable applies to every sub-agent that doesn’t set its own model. When pointed at an Anthropic-compatible gateway, that one variable can swap your entire leaf tier from one model alias to another without touching any sub-agent file. The integration is one environment variable plus a model alias:

export ANTHROPIC_BASE_URL=https://api.ofox.ai/anthropic
export ANTHROPIC_API_KEY=<your_ofox_key>
export CLAUDE_CODE_SUBAGENT_MODEL=haiku

claude

Sub-agents at any depth that don’t explicitly override the model field now route through ofox to Haiku. The root agent and any sub-agent that explicitly sets model: opus or model: sonnet still routes through ofox to the corresponding tier — CLAUDE_CODE_SUBAGENT_MODEL is a default, not an override. The Claude Code ofox configuration guide covers the full setup, including how cache_control headers pass through unchanged for prompt caching savings on the parent.

ofox accepts both the bare Claude Code aliases (opus, sonnet, haiku) and the explicit provider-prefixed IDs (anthropic/claude-opus-4.8, anthropic/claude-sonnet-4.6, anthropic/claude-haiku-4.5) that you’d reference in a direct integration. Aliases keep your sub-agent frontmatter portable across direct-Anthropic and gateway routing; explicit IDs pin you to a specific version. The model marketplace at ofox.ai/models lists the current Claude line-up so you can size the tiered tree against actual bill impact before committing.

For very long nested runs, the 5-minute default prompt-cache TTL becomes the next constraint. Setting ENABLE_PROMPT_CACHING_1H=1 extends it to one hour at 2× the cache-write cost — typically worth it once a single triage chain crosses 15 minutes of wall-clock time.

References

Claude Code CHANGELOG v2.1.172: github.com/anthropics/claude-code/blob/main/CHANGELOG.md
Claude Code sub-agents reference: code.claude.com/docs/en/sub-agents
Claude Code cost management guide: code.claude.com/docs/en/costs
ofox model gallery snapshot: ofox.ai/models

The headline is “5 levels deep,” but the useful number is “depth 2.” Most nested chains earn their keep at root-Opus, mid-Sonnet, leaf-Haiku — three layers, three model tiers. Restrict nesting where it actually works: omit Agent from leaf-tier tools, and pin a permissions.deny: ["Agent(...)"] list in settings.json. Skip the tiering and the restrictions, and the 5-level cap becomes the wallet’s last line of defence rather than the first.