Orchestration patterns
Single-agent, multi-agent, workflows, subagents.
Start with the smallest shape that works
Before reaching for a multi-agent architecture, map your problem onto the simplest orchestration shape it fits. The simplest of all is a single LLM call — prompt in, answer out.
It has no tools, no loop, no memory. For anything that needs current data or side effects you will outgrow it quickly, but when it fits, nothing is cheaper to evaluate or operate.
One agent is the common case. Start there.
There is a temptation to reach for multi-agent architectures early — a router agent, a researcher agent, a writer agent, a reviewer agent... Most of the time, a single agent with a good system prompt, a small toolkit, and thoughtful skills will outperform a multi-agent design and be far easier to evaluate and govern.
Two questions to ask before you split an agent:
- Does the split actually reduce the prompt complexity of any one role?
- Can I evaluate each sub-agent independently?
If either answer is no, keep it as one agent.
When a workflow beats an agent
The cleanest split is not multi-agent; it is an agent inside a workflow.
- A deterministic workflow handles ingest, classification, and routing.
- One (or occasionally two) agents handle the judgement-heavy steps.
- A deterministic workflow handles the final, side-effecting step (send, post, commit).
This gives you the adaptability of agents where you need it, and the auditability of workflows where you do not.
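The three-step shape above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `Ticket`, `classify`, `stub_agent`, `send`, and `run_workflow` are all hypothetical names, and the agent is a stub standing in for a real model call.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Ticket:
    text: str
    category: str = ""
    draft: str = ""


def classify(ticket: Ticket) -> Ticket:
    # Deterministic step: plain keyword rules, no model involved.
    ticket.category = "billing" if "invoice" in ticket.text.lower() else "general"
    return ticket


def stub_agent(ticket: Ticket) -> str:
    # Hypothetical stand-in for the judgement-heavy step.
    # A real system would call a model here.
    return f"[{ticket.category}] Draft reply to: {ticket.text}"


def send(ticket: Ticket, outbox: List[str]) -> None:
    # Deterministic side effect, kept outside the agent so it is easy to audit.
    outbox.append(ticket.draft)


def run_workflow(text: str, agent: Callable[[Ticket], str], outbox: List[str]) -> Ticket:
    ticket = classify(Ticket(text=text))   # workflow: ingest and classify
    ticket.draft = agent(ticket)           # agent: the only non-deterministic step
    send(ticket, outbox)                   # workflow: side effect
    return ticket


outbox: List[str] = []
result = run_workflow("Where is my invoice?", stub_agent, outbox)
```

Because the agent is passed in as a plain callable, the deterministic steps can be tested with a stub, and the agent can be evaluated on its own against recorded tickets.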
Sub-agents
When you do split, the cleanest pattern is a main agent calling sub-agents as tools. The main agent decides when to delegate; the sub-agent has its own system prompt, its own toolkit, and returns a structured result.
There are two common shapes for this, and choosing between them is a consequential decision.
Simple routing
The orchestrator classifies intent and hands the conversation off. The chosen sub-agent owns the reply and talks to the user directly.
Good for clearly-partitioned domains (billing vs. support vs. onboarding) where the routing decision is cheap and the sub-agent's voice is fine. The orchestrator stays thin; each sub-agent is self-contained and cheap to evaluate in isolation.
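A rough sketch of simple routing, under stated assumptions: `classify_intent` uses keyword rules as a stand-in for whatever cheap classifier you would actually use, and `make_sub_agent` builds stub sub-agents. All names here are hypothetical.

```python
from typing import Callable, Dict


def make_sub_agent(domain: str) -> Callable[[str], str]:
    # Hypothetical stub: a real sub-agent would have its own
    # system prompt and toolkit and would call a model.
    def reply(message: str) -> str:
        return f"[{domain}] handling: {message}"
    return reply


SUB_AGENTS: Dict[str, Callable[[str], str]] = {
    "billing": make_sub_agent("billing"),
    "onboarding": make_sub_agent("onboarding"),
    "support": make_sub_agent("support"),
}


def classify_intent(message: str) -> str:
    # Stand-in for a cheap classifier; keyword rules are an assumption.
    lowered = message.lower()
    if "invoice" in lowered or "refund" in lowered:
        return "billing"
    if "sign up" in lowered or "getting started" in lowered:
        return "onboarding"
    return "support"


def route(message: str) -> str:
    # The orchestrator only picks a sub-agent; the sub-agent's reply
    # goes back to the user unchanged.
    return SUB_AGENTS[classify_intent(message)](message)
```

The orchestrator never inspects or rewrites the reply, which is what keeps it thin. It also means nothing central checks what the sub-agent says.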
Mediated orchestration
The orchestrator dispatches to sub-agents, collects their structured results, validates them, and composes the single reply to the user. Sub-agents never speak to the user directly.
Good when the answer draws on multiple sub-agents, when the voice and policy of the final reply matter, or when a central place is needed to enforce guardrails, redaction, or escalation. You pay for more tokens and an extra round-trip, but you get one chokepoint for oversight.
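A minimal sketch of mediated orchestration, assuming each sub-agent returns a small structured result. `SubResult`, the stub agents, the confidence threshold, and the email-redaction rule are all illustrative assumptions rather than a prescribed design.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SubResult:
    source: str
    answer: str
    confidence: float


# Hypothetical stub sub-agents; real ones would call a model.
def orders_agent(question: str) -> SubResult:
    return SubResult("orders", "Order 1234 shipped on Monday.", 0.9)


def policy_agent(question: str) -> SubResult:
    return SubResult("policy", "Returns go to returns@example.com.", 0.8)


def unsure_agent(question: str) -> SubResult:
    return SubResult("guess", "Maybe try again later.", 0.2)


EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
ESCALATION = "I could not find a reliable answer; escalating to a human."


def validate(result: SubResult, min_confidence: float = 0.5) -> bool:
    # Central check: drop empty or low-confidence results.
    return bool(result.answer.strip()) and result.confidence >= min_confidence


def redact(text: str) -> str:
    # Central guardrail: sub-agent output is never shown unredacted.
    return EMAIL.sub("[redacted]", text)


def orchestrate(question: str, agents: List[Callable[[str], SubResult]]) -> str:
    results = [agent(question) for agent in agents]
    accepted = [r for r in results if validate(r)]
    if not accepted:
        return ESCALATION
    # The orchestrator, not any sub-agent, composes the single reply.
    return " ".join(redact(r.answer) for r in accepted)
```

Validation, redaction, and escalation all live in `orchestrate`, so there is one place to audit and one place to change policy.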
Sub-agents are worth the overhead when:
- One role needs a different model (e.g., a cheap Haiku classifier feeding a Sonnet planner).
- One role needs a different toolset (e.g., a browsing sub-agent isolated from your internal APIs).
- One role benefits from a tight context (fewer tools, fewer skills, shorter prompt).
They are not worth it when:
- You use them for separation of concerns only. That is a code-organisation instinct, not a runtime one. It adds latency and tokens for no capability gain.
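One way to make the criteria in the list above concrete is to compare a proposed sub-agent's spec with its parent's. The sketch below is an assumption-heavy illustration: `AgentSpec` and `reasons_to_split` are hypothetical, the model labels are placeholders rather than real model IDs, and the "half the prompt size" threshold is arbitrary.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class AgentSpec:
    name: str
    model: str              # placeholder label, not a real model ID
    tools: FrozenSet[str]
    prompt_tokens: int      # rough size of system prompt plus skills


def reasons_to_split(parent: AgentSpec, child: AgentSpec) -> List[str]:
    # Returns the runtime reasons that justify the split.
    # An empty list suggests separation of concerns only.
    reasons = []
    if child.model != parent.model:
        reasons.append("different model")
    if child.tools != parent.tools:
        reasons.append("different toolset")
    if child.prompt_tokens < parent.prompt_tokens // 2:  # arbitrary threshold
        reasons.append("tighter context")
    return reasons


planner = AgentSpec("planner", "large-model", frozenset({"crm", "search"}), 4000)
classifier = AgentSpec("classifier", "small-model", frozenset(), 500)
formatter = AgentSpec("formatter", "large-model", frozenset({"crm", "search"}), 3800)
```

Here `reasons_to_split(planner, classifier)` reports all three reasons, while `reasons_to_split(planner, formatter)` returns an empty list: the formatter would be better kept as a function or a prompt section inside the planner.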
Multi-agent "teams"
Full multi-agent orchestration — many agents coordinating, possibly with persistent memory of each other — is a research-grade pattern. Enterprise platforms are starting to support it, but the governance burden jumps by an order of magnitude.
In REMIT terms: now you have many identities, each needing provenance; envelope boundaries between agents; inter-agent audit logs; and a trust graph, not just a trust level. It is possible, and for certain use cases (trading, supply-chain coordination) it is the only architecture that works. But for most teams, this is not where you start.