3 Multi-Agent System Build Cases — Code Review, CS, and Pipeline Patterns

claude --agents '{
  "code-reviewer": { "tools": ["Read","Grep","Glob","Bash"], "model": "sonnet" },
  "debugger":      { "prompt": "Analyze errors, identify root causes." }
}'

A single command like this spins up both a dedicated code review agent and a debugging agent simultaneously. The reason most teams look into multi-agent system build cases is straightforward: a single LLM call struggles to handle tasks with fundamentally different natures — such as "code quality inspection → security vulnerability analysis → refactoring suggestions" — all at once. Splitting these into separate agents means each operates with its own independent context and tool permissions, and only the results need to be combined.

This article covers three production architectures: automated code review (Claude Code sub-agents), CS auto-response (Coordinator pattern), and data pipelines (OpenAI Agents SDK). Each case examines which multi-agent design pattern applies, what the trade-offs are, and includes code examples.

5 Multi-Agent Design Patterns — Choosing the Right Structure

![multi-agent-design-patterns]({{image:multi-agent-design-patterns|Technical diagram style illustration showing five distinct network topologies arranged in a grid: a linear chain, a branching parallel fork, a hub-and-spoke wheel, a layered hierarchy, and a fully connected mesh, each rendered as clean interconnected nodes with directional arrows on a white background with single indigo accent color, landscape orientation, no text, no letters, no words, no characters}})

The first decision when designing a multi-agent system is the communication structure between agents. The Google Cloud Agentic AI Design Pattern Guide defines five patterns.

Sequential — Step-by-Step Execution

A pipeline structure where Agent A’s output becomes Agent B’s input. It’s well-suited for workflows with fixed stages like data preprocessing → analysis → report generation. Implementation is simple, but a slow upstream stage blocks the entire pipeline.
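The data flow can be sketched in a few lines. This is a hedged, framework-free illustration: the `preprocess`, `analyze`, and `report` functions are hypothetical stand-ins for what would be separate LLM agent calls in a real system.

```python
from typing import Callable

# Hypothetical stand-ins for agent calls; each would be an LLM call in practice.
def preprocess(data: str) -> str:
    return data.strip().lower()

def analyze(text: str) -> dict:
    return {"word_count": len(text.split())}

def report(analysis: dict) -> str:
    return f"Report: {analysis['word_count']} words analyzed"

def run_sequential(pipeline: list[Callable], initial):
    """Feed each agent's output into the next; a slow stage blocks everything after it."""
    result = initial
    for agent in pipeline:
        result = agent(result)
    return result

print(run_sequential([preprocess, analyze, report], "  Raw Input Data  "))
# → Report: 3 words analyzed
```

The chain is just function composition, which is why Sequential is the simplest pattern to implement and to debug.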

Parallel — Concurrent Execution with Aggregation

The same input is sent to multiple agents simultaneously, and results are aggregated or selected by vote. Running security checks and style checks concurrently during code review is a typical example.
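A minimal sketch of the fan-out-and-aggregate flow, using threads in place of real agent calls. The `security_check` and `style_check` stubs are hypothetical examples of independent review agents.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical checkers standing in for independent review agents.
def security_check(diff: str) -> str:
    return "security: secret detected" if "password=" in diff else "security: clean"

def style_check(diff: str) -> str:
    return "style: ok" if diff.endswith("\n") else "style: missing trailing newline"

def run_parallel(agents, shared_input: str) -> list[str]:
    """Fan the same input out to every agent, then collect all results."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(agent, shared_input) for agent in agents]
        # Overall latency is roughly that of the slowest agent, not the sum.
        return [f.result() for f in futures]

print(run_parallel([security_check, style_check], "password=input()\n"))
# → ['security: secret detected', 'style: ok']
```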

Coordinator — Dynamic Routing by a Central AI

A central AI agent analyzes incoming requests, decomposes them into subtasks, and dynamically routes them to specialized agents. This is flexible but increases model calls, raising both latency and operational costs compared to a single agent. It’s well-suited for systems like CS auto-response where input types are diverse.
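The routing logic can be sketched as follows. Here a rule-based `classify()` stands in for the central model call a real Coordinator makes; the specialist agents and intents are hypothetical.

```python
# Hypothetical specialist agents behind the Coordinator.
def refund_agent(msg: str) -> str:
    return "refund ticket opened"

def shipping_agent(msg: str) -> str:
    return "tracking info sent"

def general_agent(msg: str) -> str:
    return "forwarded to support"

ROUTES = {"refund": refund_agent, "shipping": shipping_agent}

def classify(message: str) -> str:
    # In production this classification is itself a model call, which is the
    # extra latency/cost the Coordinator pattern pays for its flexibility.
    for intent in ROUTES:
        if intent in message.lower():
            return intent
    return "general"

def coordinate(message: str) -> str:
    return ROUTES.get(classify(message), general_agent)(message)

print(coordinate("When does my shipping arrive?"))  # → tracking info sent
```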

Hierarchical Task Decomposition

A higher-level agent decomposes tasks, lower-level agents process them, and the higher-level agent synthesizes the results. Similar to Coordinator, but task decomposition runs two or more levels deep.

Swarm — All-to-All Communication

All agents communicate with each other and iteratively refine outputs. This is the most complex and expensive pattern, with a risk of unproductive loops without proper termination conditions.

| Pattern | Latency | Token Consumption | Complexity | Suitable Scenarios |
| --- | --- | --- | --- | --- |
| Sequential | High (serial) | Low | Low | Fixed pipelines, ETL |
| Parallel | Low (concurrent) | Medium | Low | Code review, multi-validation |
| Coordinator | Medium | High | Medium | CS routing, dynamic branching |
| Hierarchical | Medium–High | High | High | Complex research, large-scale decomposition |
| Swarm | Indeterminate | Very high | Very high | Creative tasks, iterative refinement |

Deterministic vs Dynamic Workflow Selection
When the input format is fixed, Sequential or Parallel patterns are appropriate. When input types vary and execution paths need to diverge, choose Coordinator or Hierarchical patterns.

The Review & Critique pattern has a generator agent produce output while a critic agent validates it against defined criteria. The Human-in-the-Loop pattern pauses execution at critical decision points for human approval. Both patterns can be combined with any of the five above.

Case 1 — Automated Code Review: Claude Code Sub-Agents

Claude Code sub-agents each run with an independent context window, a custom system prompt, and specific tool access permissions. When the main agent delegates a task, the sub-agent works independently and returns only the result. Multiple sub-agents can run in parallel, and built-in sub-agent types include the read-only Explore, the planning-focused Plan, and a general-purpose type.

Applying the Parallel pattern to automated code review produces the following structure.

claude --agents '{
  "code-reviewer": {
    "description": "Expert code reviewer. Use proactively after code changes.",
    "prompt": "You are a senior code reviewer. Focus on code quality, security, and best practices.",
    "tools": ["Read", "Grep", "Glob", "Bash"],
    "model": "sonnet"
  },
  "debugger": {
    "description": "Debugging specialist for errors and test failures.",
    "prompt": "You are an expert debugger. Analyze errors, identify root causes, and provide fixes."
  }
}'

code-reviewer can only access Read, Grep, Glob, and Bash tools. debugger has no tool restrictions and can use all available tools. When both agents run simultaneously, code quality checks and error analysis are processed in parallel. The key principle in this structure is tool permission separation — withholding Write permission from the reviewer ensures it can only read code, never modify it.

Extending with the Coordinator Pattern

As the number of sub-agents grows, a coordinator pattern can restrict which sub-agent types are allowed to be spawned. Specifying Agent(worker, researcher) in the tools field limits creation to only those types.

---
name: coordinator
description: Coordinates work across specialized agents
tools: Agent(worker, researcher), Read, Bash
---

In this configuration, the coordinator can only spawn worker and researcher types. Sub-agents cannot recursively spawn other sub-agents, which prevents unbounded agent proliferation at the architectural level.

Sub-Agents vs Agent Teams
Sub-agents operate within a single session. For multiple agents that run in parallel and communicate with each other, agent teams are required. Agent teams enable coordination across separate sessions.

Combining the Parallel pattern with coordinator restrictions in automated code review creates this flow: the main agent reads the PR diff → delegates simultaneously to code-reviewer, security-checker, and style-linter sub-agents → the coordinator synthesizes the results. This completes the Claude Code sub-agent architecture.

Case 2 — CS Auto-Response: Coordinator Pattern Architecture

![coordinator-pattern-cs-architecture]({{image:coordinator-pattern-cs-architecture|Editorial illustration of a central robotic dispatcher figure standing at a crossroads, holding multiple glowing envelopes and routing each one down diverging conveyor paths toward specialized workstation booths, warm industrial lighting with teal and amber tones, shallow depth of field, landscape orientation, no text, no letters, no words, no characters}})

CS auto-response systems face diverse input types. Refund requests, shipping inquiries, technical support questions, and general inquiries all arrive through the same channel. A Sequential pattern can’t handle this. It’s a textbook scenario for the Coordinator pattern.

In the Google Cloud multi-agent reference architecture, the coordinator agent manages sub-agents via the Agent2Agent (A2A) protocol and standardizes tool access through MCP (Model Context Protocol). Runtime options include Cloud Run, GKE, or Vertex AI Agent Engine.

CS System Architecture

[User Input]
    │
    ▼
┌──────────────┐
│ Coordinator  │  ← Request classification + routing
│   Agent      │
└──────┬───────┘
       │  A2A Protocol
  ┌────┼────┬────────┐
  ▼    ▼    ▼        ▼
[Refund][Shipping][Tech Support][General]
Agent   Agent     Agent         Agent
  │     │         │             │
  └─────┴─────────┴─────────────┘
       │  MCP
       ▼
  [DB / API / CRM]

The Coordinator Agent analyzes the user message to identify intent and routes it to the appropriate specialized agent. Each specialized agent accesses tools like DB queries, API calls, and CRM updates through MCP.

Security Requirements

The A2A protocol mandates HTTPS in production environments, with TLS 1.2+ support and OAuth2/OpenID Connect authentication passed via HTTP headers. Since CS systems handle customer personal information, this security layer isn’t optional.

Input/Output Inspection with Model Armor
The Google Cloud reference architecture uses Model Armor for agent input/output security inspection. It blocks cases where customers attempt prompt injection or agents generate responses containing personal information.

The trade-offs of this structure are clear. Since the Coordinator classifies every request first, the minimum model call count is two per request (classification + processing). For services where 80% of inquiries are simple, a hybrid design placing a lightweight classifier upfront and routing only the remainder to the Coordinator is more cost-efficient.
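The hybrid design can be sketched as a cheap deterministic check in front of the Coordinator. The FAQ table, keywords, and answers below are hypothetical; the point is that the simple majority of requests never reaches the model.

```python
# Hypothetical FAQ table: a cheap deterministic check answers simple inquiries
# directly, and only the remainder pays the Coordinator's extra model calls.
FAQ_ANSWERS = {
    "opening hours": "We are open 9am-6pm.",
    "shipping fee": "Shipping is free for orders over $50.",
}

def coordinator_pipeline(message: str) -> str:
    # Placeholder for the full Coordinator path (classification + processing,
    # i.e. at least two model calls per request).
    return f"[coordinator] routed: {message}"

def handle(message: str) -> str:
    lowered = message.lower()
    for keyword, answer in FAQ_ANSWERS.items():
        if keyword in lowered:
            return answer  # zero model calls for the simple majority
    return coordinator_pipeline(message)

print(handle("What are your opening hours?"))
# → We are open 9am-6pm.
print(handle("My order arrived damaged and I want compensation"))
# → [coordinator] routed: My order arrived damaged and I want compensation
```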

Case 3 — Data Pipeline: OpenAI Agents SDK

The OpenAI Agents SDK is a lightweight framework for multi-agent workflows that delegates tasks through handoffs between agents. Key features include agent configuration (instructions, tools, guardrails, handoffs), sandbox agents (isolated filesystem and command execution environments), session management, MCP integration, and support for 100+ LLMs.

Applying the Sequential pattern to a data pipeline means separating the collection → cleaning → loading stages into independent agents. Sandbox agents isolate the filesystem and command execution environment, ensuring that data cleaning scripts don’t affect the host system.

from agents import RunConfig, Runner
from agents.sandbox import Manifest, SandboxAgent, SandboxRunConfig
from agents.sandbox.entries import GitRepo
from agents.sandbox.sandboxes import UnixLocalSandboxClient

agent = SandboxAgent(
    name="Workspace Assistant",
    instructions="Inspect the sandbox workspace before answering.",
    default_manifest=Manifest(
        entries={
            "repo": GitRepo(repo="openai/openai-agents-python", ref="main"),
        }
    ),
)

result = Runner.run_sync(
    agent,
    "Inspect the repo README and summarize what this project does.",
    run_config=RunConfig(sandbox=SandboxRunConfig(client=UnixLocalSandboxClient())),
)

In this code, SandboxAgent clones a Git repo into a sandbox workspace and analyzes files in an isolated environment. UnixLocalSandboxClient runs the sandbox on a local Unix environment. For data pipelines, this pattern extends so that a collection agent calls external APIs to fetch data, and a cleaning agent runs preprocessing scripts within the sandbox.

Handoff-Based Task Delegation

The core concept of the OpenAI Agents SDK is the handoff. When Agent A completes its work, it passes control along with context to Agent B, which handles the next stage. This is a natural implementation of the Sequential pattern. Setting guardrails at handoff points enables validation that data passed to the next agent conforms to the expected schema.
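A guarded handoff can be sketched without the framework. The field names and agents below are hypothetical, and the real SDK expresses this through its own handoff and guardrail APIs; the sketch only shows the shape of the schema check at the handoff boundary.

```python
# Framework-free sketch of a guarded handoff between pipeline stages.
EXPECTED_SCHEMA = {"records": list, "source": str}

def guardrail(payload: dict) -> dict:
    """Validate the handoff payload before the next agent takes over."""
    for field, ftype in EXPECTED_SCHEMA.items():
        if not isinstance(payload.get(field), ftype):
            raise ValueError(f"handoff blocked: bad or missing field '{field}'")
    return payload

def collector_agent() -> dict:
    return {"records": [{"id": 1}, {"id": 2}, {}], "source": "api"}

def cleaner_agent(payload: dict) -> dict:
    payload["records"] = [r for r in payload["records"] if "id" in r]
    return payload

# Handoff: collector finishes, the guardrail checks the schema, cleaner continues.
result = cleaner_agent(guardrail(collector_agent()))
print(len(result["records"]))  # → 2
```

Raising at the boundary rather than inside the downstream agent keeps the fault isolation property: the collector's output is preserved even when the handoff is rejected.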

Separating agents in a data pipeline offers three practical benefits.

  1. Fault isolation — If the cleaning agent fails, the collection agent’s output is preserved
  2. Independent scaling — If collection is the bottleneck, only the collection agent needs to be replicated
  3. Model separation — Lightweight models handle simple ETL, while high-performance models are assigned to unstructured data interpretation

The code above is an example from the OpenAI Agents SDK GitHub repository README. The platform.openai.com documentation was returning 403 at the time of writing, so detailed content from the official Agents SDK guide was not separately verified.

The Role of A2A and MCP in Multi-Agent Systems

When multi-agent system build cases move into production, communication standards between agents become essential. Running agents built with different frameworks (Claude Code, OpenAI Agents SDK, Google ADK) in a single system requires a common protocol.

A2A (Agent2Agent) Protocol

The A2A protocol provides interoperability between agents running on different programming languages and runtimes. When a data cleaning agent written in Python and a notification agent written in TypeScript need to operate in the same pipeline, A2A bridges the gap.

Security requirements for production environments:

  • HTTPS mandatory
  • TLS 1.2+ support
  • OAuth2/OpenID Connect authentication passed via HTTP headers

Concrete implementation code examples for the A2A protocol are not yet available in the official documentation. Currently, only protocol specifications and reference architecture-level guides are provided, so the protocol specification must be referenced directly during implementation.

MCP (Model Context Protocol)

MCP standardizes how agents access external tools. In the Google Cloud multi-agent reference architecture, the coordinator agent uses A2A to manage sub-agents and MCP to standardize tool access — a dual-protocol structure.

| Protocol | Role | Communication Target |
| --- | --- | --- |
| A2A | Agent ↔ Agent | Between agents on different runtimes |
| MCP | Agent → Tool | External resources like DBs, APIs, filesystems |

The two protocols serve different roles. A2A handles horizontal communication between agents, while MCP handles vertical access from agents to tools. Using only A2A in a multi-agent system means each agent implements tool access differently. Using only MCP means agent-to-agent routing must be built from scratch.

Comparing Three Architectures — Pattern, Cost, and Complexity

Here’s a comparison of the three build cases covered above by pattern, cost, and complexity.

| Item | Code Review (Claude Code) | CS Auto-Response (Coordinator) | Data Pipeline (OpenAI SDK) |
| --- | --- | --- | --- |
| Design pattern | Parallel + Coordinator | Coordinator | Sequential |
| Agent count | 2–4 | 4–6 | 2–3 |
| Model calls per request | 2–4 | 2–3 | 2–3 |
| Communication method | Within single session | A2A + MCP | Handoff |
| Primary trade-off | Increased token consumption during parallel execution | Added cost of classification layer | Sequential execution latency |
| Fault isolation | Independent per sub-agent | Independent per agent | Independent per stage |

All three cases share the same core benefit of agent separation: each agent has independent context and tool permissions, so a single agent’s failure doesn’t bring down the entire system. The differences lie in communication structure and orchestration complexity.

Google Cloud’s official documentation provides seven multi-agent deployment examples: financial advisor (stock trade recommendations), research assistant (plan-collect-evaluate), insurance agent (enrollment and claims processing), search optimizer, data analyzer, web marketing agent, and Airbnb planner. These demonstrate that multi-agent applicability extends well beyond the three cases covered here.

Pattern Selection Criteria

Simplified selection rules:

  • Fixed input format → Sequential or Parallel
  • Diverse input types requiring branching → Coordinator
  • Task decomposition depth of 2+ levels → Hierarchical
  • Inter-agent feedback required → Swarm (watch costs)
  • Output validation required → Combine with Review & Critique
  • High-risk decisions → Combine with Human-in-the-Loop

In practice, pattern combinations are more common than single patterns. Mixing Parallel + Coordinator as in the code review case, or adding Review & Critique to a data pipeline to validate data quality before loading, represents realistic production architecture.

Common Trade-Offs and Considerations When Building Multi-Agent Systems

Quantitative performance metrics from actual enterprise production multi-agent deployments are not available in official documentation. The trade-offs below are therefore analyzed at the architectural design level.

Cost increase: As the number of agents grows, model call counts increase proportionally. The Coordinator pattern requires a minimum of two calls (classification + processing), while Swarm calls scale with iteration count. A model tiering strategy — assigning low-cost models to classification and reserving high-performance models for final processing — is necessary.

Latency: In the Parallel pattern, the slowest agent’s response time determines overall latency. In the Sequential pattern, total latency equals the sum of all agents’ response times. For CS systems requiring real-time responses, individual timeouts must be configured per agent.

Debugging complexity: With three or more agents, tracing which agent produced incorrect results becomes difficult. An observability design that logs each agent’s input/output and assigns tracing IDs per request must be established upfront.
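One lightweight way to get this is a wrapper that logs every agent call as structured JSON under a shared trace ID. The sketch below is a hypothetical illustration, not any particular observability product; the agent names and payloads are made up.

```python
import json
import time
import uuid

# Hypothetical wrapper: logs each agent's input/output as structured JSON
# under a single trace ID so one request can be followed across agents.
def traced(agent_name: str, agent_fn):
    def wrapper(payload, trace_id: str):
        start = time.time()
        output = agent_fn(payload)
        print(json.dumps({
            "trace_id": trace_id,
            "agent": agent_name,
            "input": str(payload)[:200],   # truncate to keep logs bounded
            "output": str(output)[:200],
            "latency_ms": round((time.time() - start) * 1000, 1),
        }))
        return output
    return wrapper

# One trace ID is minted per incoming request and threaded through every agent.
trace_id = str(uuid.uuid4())
classifier = traced("classifier", lambda msg: "refund")
refund = traced("refund-agent", lambda intent: f"handled: {intent}")
refund(classifier("I want my money back", trace_id), trace_id)
```

Searching the logs for one trace ID then reconstructs the full path a request took, which is exactly what becomes hard to do by hand once three or more agents are involved.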

Runtime Selection Guide
On Google Cloud, choose among Cloud Run (serverless, small scale), GKE (containers, large scale), or Vertex AI Agent Engine (managed) based on traffic volume and operational capacity. Each option has different autoscaling characteristics, so the decision should align with agent response time SLAs.

Comparative resources covering multi-agent patterns across Anthropic, OpenAI, and Google platforms in Korean are virtually nonexistent. All official documentation is in English, so direct reference to the original documents is necessary for production deployment.

Next Steps — Advanced Multi-Agent Orchestration

With the basic structure of multi-agent system build cases established, three areas warrant deeper exploration. First is building an MCP server to give agents standardized access to internal databases and APIs. A single MCP server enables both Claude Code sub-agents and OpenAI Agents SDK agents to connect through the same tool interface.

Second is session management design that minimizes context loss during inter-agent handoffs. As the number of agents increases, a validation layer is needed to ensure intermediate results from one agent are fully transmitted to the next. Third is building an observability pipeline to monitor agent execution costs and track token usage per agent. As understanding of multi-agent orchestration patterns and AI agent design patterns deepens, complex workflows that a single agent can’t handle become reliably operable.
