TL;DR: A dynamic workflow in Claude Code is a JavaScript script that orchestrates subagents. Claude writes the script for your task, the runtime executes it with up to 1,000 subagents (16 concurrent), and Claude only sees the final cross-checked answer. The interesting bit isn’t the agent count. It’s that the plan has moved out of the context window and into code you can read, diff, and re-run.
If you’ve used Claude Code subagents before, the mental model has been pretty consistent: Claude is the orchestrator, it spawns helpers turn by turn, each helper’s result comes back into Claude’s context, Claude decides what to do next. That loop works fine for a handful of helpers. It stops working when the task needs a hundred.
Dynamic workflows, which Anthropic shipped on May 28, 2026 alongside Claude Opus 4.8, reorganize that loop. Claude doesn’t run the orchestration any more. Claude writes a JavaScript script that runs the orchestration, and the Claude Code runtime executes that script for you. Reading the official docs, the shift is laid out almost in passing in the comparison table (“Who decides what runs next: the script” instead of “Claude, turn by turn”), but I think it’s the most interesting architectural decision in the release, and worth pulling apart.
What the runtime actually executes
A workflow is a script, not a prompt. Concretely: when you trigger one (by saying ultracode in your prompt, by running the bundled /deep-research, or by asking for “a workflow”), Claude generates a JavaScript file for the task. That file holds the orchestration logic: which subagents to spawn, in what order, how to fan work out and gather it back, when to branch on a result, when to retry. The runtime then runs that file in the background.
The script can’t touch the filesystem or the shell itself. Those go through the subagents it spawns; the script is the coordinator. That separation is what makes the agent caps enforceable: the runtime knows exactly how many agents are alive, how many are queued, and whether either is about to exceed its limit.
Two specific limits, both from the docs:
| Limit | Value | Why it’s set there |
|---|---|---|
| Concurrent agents | 16 (fewer on low-core machines) | Bounds local resource use |
| Total agents per run | 1,000 | Prevents runaway loops |
The 16 is per-machine ergonomics, not a deep claim about parallel-ability. The 1,000 is the more interesting number. It’s a hard cap that says, even on a runaway recursion or a poorly-written loop, the script can’t keep spawning agents forever. That’s the kind of guardrail you can only add when the orchestrator is a piece of code, not a model’s turn-by-turn choices.
Why the context-window decision matters
When Claude was the orchestrator, every intermediate result had to land in Claude’s context. A research agent returns five citations? Those go in the context. A code-review agent returns 200 lines of feedback? Also in the context. Run twenty of these and the context window stops being a place to think and starts being a logbook.
A workflow puts intermediate results in script variables instead. They’re regular JavaScript objects, sitting in the runtime’s memory, that the script can filter, deduplicate, vote on, and discard at will. Claude only sees what the script eventually returns, which is meant to be the final answer — already cross-checked.
The docs are explicit about this:
A workflow script holds the loop, the branching, and the intermediate results itself, so Claude’s context holds only the final answer.
I think this is doing more than it looks. The context window has been the limiting reagent in basically every serious agentic system write-up. Not the model’s intelligence, not the tool count, but how much working memory you can spend on a task before the signal starts decaying. Pushing intermediate state out of it is the kind of move that should have been obvious in retrospect, and probably will be once a few more frameworks copy it.
Adversarial verification, made repeatable
There’s a small phrase in the docs that I think is the actual product:
Moving the plan into code also lets a workflow apply a repeatable quality pattern, not just run more agents: it can have independent agents adversarially review each other’s findings before they’re reported.
In a turn-by-turn agent loop, “have two independent reviewers cross-check this and only surface claims they both accept” is a very awkward thing to encode. You’d have to babysit it from the conversation, hold the candidate findings in your context, spin up two more subagents, compare their outputs, and remember to drop the rejects. People do this; it’s annoying.
In a script, it’s a function. You collect findings, fan them to N independent reviewers (each with its own context, none of which sees the others’ opinions), tally the votes, return only the survivors. The bundled /deep-research workflow does exactly this. It fans web searches across angles, fetches sources, and “votes on each claim” so that “claims that didn’t survive cross-checking” are dropped before the report lands.
This is the part that justifies the term “dynamic workflow” over “more subagents.” A simple agent fan-out doesn’t get you cross-examination. The script form does.
The Bun rewrite, a concrete shape for the scale
The case study Anthropic likes to point at is the Bun runtime port from Zig to Rust. Bun was acquired by Anthropic earlier this year, and on May 14, 2026 Jarred Sumner merged PR #30412: the entire Zig codebase, ported to Rust, with 6,755 commits across 2,188 files, just over a million lines of Rust added, and a 99.8% test pass rate on Linux x64 glibc. The PR opened on May 8 and merged on the 14th, six calendar days.
Jarred confirmed on X that “dynamic workflows and adversarial code review was part of what made it possible to rewrite Bun in Rust in 6 days.” Internally, before the public release. The structure the Anthropic team has described is roughly:
- One workflow walked the Zig codebase and mapped the right Rust lifetime for each struct field.
- Another wrote each
.rsfile as a behavior-identical port of its.zigcounterpart, with hundreds of agents in parallel and two reviewers on each file. - A fix-loop workflow then drove the Rust build and test suite to convergence.
- An overnight pass picked up unnecessary data copies and opened PRs for each.
That’s still a startling amount of code in a short window, and reasonable people are skeptical: the merged tree has over 13,000 unsafe blocks, compared with about 73 in uv, a similarly-sized hand-written Rust project. The Rust build is canary-only; v1.3.14 was the last Zig release. The port isn’t in production yet, and the skeptics may end up right that this kind of mass-translation produces a structurally less safe codebase. I want to flag that honestly rather than wave it past.
But the part that’s hard to argue with: this happened at all. A six-day port of a million-line codebase across two languages is the kind of thing that, a year ago, you wouldn’t have proposed without being asked to leave the room. Whatever the long-tail issues, dynamic workflows demonstrably ran the orchestration that made it tractable, and the script-form is what let it scale past where a single agent’s context window would have collapsed.
When to use one (and when not to)
The docs include a comparison table that’s worth reading directly, but the short version: reach for a workflow when the task needs more agents than one conversation can coordinate, or when you want the orchestration codified as a script you can read and re-run. The second case is the under-told one. A code review you run on every branch becomes a saved workflow; next time you run it, the same script runs the same orchestration. The reproducibility is the value, not just the scale.
Where it doesn’t fit: anything that needs human input mid-run (workflows can’t pause for that), anything where the script itself needs filesystem access (the script coordinates agents, agents do the work), and anything small enough that the overhead of spinning up a workflow runtime is silly. If you can do it in three subagent calls, do it in three subagent calls.
A pragmatic note from the docs that I appreciate: the run cost can balloon, so the suggestion is to run a workflow on a small slice first (one directory instead of the repo, a narrow question instead of the broad one), watch the per-agent token usage in /workflows, and stop the run there if it’s running away. The 1,000-agent cap bounds the upper edge of disaster, but the bill before that cap can still be real.
What I think this changes
I don’t want to oversell a research preview. It might turn out that the script-as-orchestrator shape works well for codebase audits and migrations and badly for everything else. The 13,000 unsafe blocks in Bun is a real signal that we don’t yet know what mass-LLM-written code looks like under the load of production. Adversarial verification at the agent level is a good move and almost certainly not enough on its own.
But the framing is the thing that’s stuck with me. In a workflow, Claude isn’t agentic in the loop. Claude is one step removed: it writes the program that drives the agents. The relationship between “model” and “process” rotates, and the model starts looking like a compiler for orchestration plans instead of an executor of them. That feels like a more durable architectural idea than the 1,000-agent number, and I expect it to show up in other agent frameworks within the year.
Read next
- Anatomy of a 13-line skill — how a tiny skill file actually executes inside Claude Code
- Skill design as interface design — the contract between Claude and a skill, and how it differs from a prompt
- No evidence, no completion — why a confident agent report isn’t the same as confirmed work, and how verification fits at the agent boundary
- Prior art: what distributed systems already knows — coordination patterns from systems literature that apply directly to multi-agent runs
- Claude Code 多了個 dynamic workflows,我打開那段 JS 看了一下 — Chinese companion piece, different angle on the same release



