Product management series
May 13, 2026
by Piyush · 7 min read

The Control Tower Pattern: How PMs Should Design Multi-Agent Products

Tags: ContextOS · Product Management · Multi-Agent Systems · Orchestration · Agents

Multi-agent systems are easy to draw and hard to operate.

The common failure is an org chart made of prompts:

Research Agent -> Planning Agent -> Execution Agent -> QA Agent

It looks sophisticated. In practice it often adds latency, blurs ownership, drops context between handoffs, and multiplies the places where nobody can explain why the final answer happened.

Product managers need a stricter rule:

Add another agent only when the product needs a separate context, authority, tool surface, or scorecard.

Use the airport analogy again. A control tower does not create a “landing agent,” “runway agent,” and “weather agent” because the diagram looks cleaner. It separates responsibilities because the work has different data, timing, authority, and failure modes.

That is the ContextOS view of multi-agent products: a parent orchestrator, specialist lanes, a Critic, a Tool Gateway, and one final receipt.

Workflow, agent, or multi-agent?

Before designing a multi-agent product, decide the runtime shape.

| Shape | Product fit | PM warning |
|---|---|---|
| Single call | Short, low-risk answer | Do not overbuild |
| Fixed workflow | Known steps, predictable handoffs | Better than "agent" for many products |
| Planner / Executor / Critic | Adaptive tool use and recovery needed | Requires trace and budget discipline |
| Orchestrator + lanes | Parallel or specialized work creates measurable value | Must preserve one owner for the final decision |
| Long-running session | Work spans hours, days, or systems | Requires checkpoints and progress contracts |

Anthropic’s effective-agent guidance makes the same practical distinction: workflows use predefined code paths, while agents dynamically direct tool use. The PM implication is simple: do not buy autonomy you cannot score.
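The distinction is easy to make concrete in code. A minimal sketch, with hypothetical names (`run_workflow`, `run_agent` are illustrations, not a ContextOS or Anthropic API): a workflow executes a predefined code path, while an agent loops, choosing its next tool until it decides it is done or exhausts a budget.

```python
# Illustrative sketch: fixed workflow vs. agent loop.
# Names are hypothetical, not a real framework API.

def run_workflow(steps, state):
    """Predefined code path: every run executes the same steps in order."""
    for step in steps:
        state = step(state)
    return state

def run_agent(choose_action, tools, state, budget=10):
    """Agent loop: the model picks the next tool until it stops or the budget runs out."""
    for _ in range(budget):
        action = choose_action(state)      # model decides dynamically
        if action == "done":
            return state
        state = tools[action](state)
    raise RuntimeError("budget exhausted")  # autonomy you cannot score must still halt
```

The budget parameter is the point: if you cannot bound and score the loop, you have bought autonomy you cannot evaluate.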

The control tower pattern

In ContextOS, a complex multi-agent system should look like this:

  • Parent Orchestrator: owns intent, RunContext, budget, and the final DecisionRecord
  • Specialist Lanes: run bounded subtasks with scoped tools and context
  • Critic: verifies plans, scores lane outputs, accepts or rejects synthesis
  • Tool Gateway: enforces schemas, policy, approval modes, and audit
  • DecisionRecord: records the final outcome, evidence, approvals, trace, and replay handle

The parent orchestrator is the control tower. Specialist agents are crews. Crews can inspect and prepare. They do not clear the runway.
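One way to keep the diagram honest is to write the core records down as data shapes. A hypothetical sketch, with field names drawn from the outline above (this is an illustration, not a ContextOS schema):

```python
# Hypothetical data shapes for the control tower pattern.
# Field names follow the outline above; a sketch, not a real schema.
from dataclasses import dataclass, field

@dataclass
class RunContext:
    intent: str                     # e.g. "renewal.prepare_proposal"
    budget_tokens: int              # hard spend limit set by the parent
    allowed_tools: list[str]        # scoped tool surface for this lane
    evidence_refs: list[str] = field(default_factory=list)

@dataclass
class LaneOutput:
    lane: str
    artifact: dict                  # typed result, not a transcript
    evidence_refs: list[str]        # must be non-empty to be accepted

@dataclass
class DecisionRecord:
    outcome: str
    evidence: list[str]
    approvals: list[str]
    trace_id: str                   # replay handle for audit
```

If a box in the diagram cannot be expressed as one of these records, it is probably decoration.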

When to split into specialist lanes

Use this test:

| Split condition | Example |
|---|---|
| Different evidence set | Contract review needs signed terms; billing needs the SKU catalog |
| Different tool surface | Compliance can call screening tools; comms can draft emails |
| Different risk class | Intake is read-only; payment activation is destructive |
| Different evaluator | Legal accuracy and customer tone need different rubrics |
| Parallelizable work | KYC, contract extraction, and environment setup can run concurrently |
| Different owner | Legal, finance, support, and implementation have separate accountability |

If none of these are true, keep it in one workflow.

Multi-agent product anti-patterns

| Anti-pattern | Why it fails | Better pattern |
|---|---|---|
| Agent per department | Mirrors org politics, not work boundaries | Intent- and evidence-based lanes |
| Worker can mutate final state | No single accountable decision | Parent accepts worker output before effects |
| Every worker sees everything | Context bloat and leakage | Scoped Context Pack per lane |
| Agent debate without evidence | More tokens, same uncertainty | Require evidence refs and Critic verdicts |
| No lane-specific evals | Cannot tell which specialist regressed | Score by lane and final outcome |
| Shared tool pool | Risk bleed across lanes | Tool Gateway per lane authority |

The PM should reject multi-agent diagrams that do not show authority, evidence, and final ownership.

Worked example: enterprise renewal desk

Goal:

Help account teams prepare, approve, and send enterprise renewal proposals.

The naive product idea:

A renewal agent that handles renewals.

The control tower version:

| Lane | Job | Context | Tools | Risk |
|---|---|---|---|---|
| Account Intake | Normalize account, renewal date, owners | CRM, account notes | read CRM | read_only |
| Usage Analyst | Analyze adoption and expansion signals | product analytics | query metrics | network |
| Contract Reviewer | Extract terms, renewal clauses, restrictions | contract repo | read contracts | read_only |
| Pricing Specialist | Draft pricing options | price book, discount policy | create quote draft | local_write |
| Risk Reviewer | Identify churn, legal, and finance risks | history, exceptions | policy eval | network |
| Comms Drafter | Draft customer-facing renewal narrative | approved facts | draft email | local_write |
| Deal Desk Gate | Approve discount or non-standard terms | full packet | approval gate | destructive |

The parent orchestrator owns the renewal packet and final DecisionRecord.

The PM spec for each lane

Each specialist lane needs a mini-spec:

```yaml
lane: pricing_specialist
parent_intent: renewal.prepare_proposal
mission: draft pricing options and discount rationale
context_pack:
  required:
    - account_tier
    - current_contract_value
    - usage_trend
    - approved_price_book
    - discount_policy
tools:
  allowed:
    - pricebook.lookup
    - quote.create_draft
  denied:
    - quote.send_to_customer
approval_mode: local_write
output:
  type: pricing_recommendation
  fields:
    - recommended_package
    - discount_percent
    - rationale
    - evidence_refs
evals:
  - discount_policy_compliance
  - margin_floor_preserved
  - rationale_evidence_coverage
```

If a lane cannot be specified this way, it is not ready to be a separate agent.
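That readiness test can itself be automated. A minimal sketch, assuming the spec has been parsed into a dict (for example with `yaml.safe_load` on the block above); the required keys mirror the example spec:

```python
# Minimal lane-spec readiness check, assuming the spec is already parsed
# into a dict. Keys mirror the example lane spec above; an illustration only.

REQUIRED_KEYS = ["lane", "parent_intent", "mission", "context_pack",
                 "tools", "approval_mode", "output", "evals"]

def lane_spec_gaps(spec: dict) -> list[str]:
    """Return the reasons a lane is not ready to be a separate agent."""
    gaps = [f"missing: {k}" for k in REQUIRED_KEYS if k not in spec]
    if not spec.get("tools", {}).get("allowed"):
        gaps.append("no allowed tools: lane cannot act")
    if not spec.get("evals"):
        gaps.append("no evals: lane regressions cannot be scored")
    if not spec.get("output", {}).get("fields"):
        gaps.append("untyped output: parent cannot accept or reject it")
    return gaps
```

An empty list means the lane has earned a spec; a non-empty list is the PM's punch list before the lane ships.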

Parent orchestration rules

The parent orchestrator should have rules like:

  • It may spawn lanes only from approved task templates.
  • It must pass each lane a scoped RunContext.
  • It must set lane budgets.
  • It must reject lane outputs without required evidence refs.
  • It must not let lane outputs directly produce side effects.
  • It must synthesize one final plan.
  • It must produce one final DecisionRecord.

This is Orchestration, not “coordination by vibes.”
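A few of these rules can be sketched as guard code. The names here are hypothetical (approved templates and budgets would come from product config); the point is that the rules above are checks, not prose:

```python
# Sketch: the parent enforces lane rules as code, not convention.
# Template names and the exception type are hypothetical.

class LaneRejected(Exception):
    pass

APPROVED_TEMPLATES = {"account_intake", "pricing_specialist", "risk_reviewer"}

def spawn_lane(template: str, budget_tokens: int) -> dict:
    """Spawn only from approved task templates, with an explicit budget."""
    if template not in APPROVED_TEMPLATES:
        raise LaneRejected(f"not an approved task template: {template}")
    if budget_tokens <= 0:
        raise LaneRejected("lane must have a positive budget")
    return {"template": template, "budget_tokens": budget_tokens}

def accept_output(output: dict) -> dict:
    """Reject outputs without evidence; never execute lane side effects directly."""
    if not output.get("evidence_refs"):
        raise LaneRejected("lane output has no evidence refs")
    # Only the typed artifact flows up for synthesis; effects wait for the parent.
    return output["artifact"]
```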

The Critic is the product safety net

The Critic is not a “QA agent” bolted on at the end.

It verifies:

| Check | Product question |
|---|---|
| Plan validity | Is this path allowed for the intent? |
| Evidence sufficiency | Do we have the facts needed to decide? |
| Tool authorization | Are these tools allowed for this RunContext? |
| Approval mode | Is the right gate required before side effects? |
| Lane quality | Did each specialist return a typed, usable result? |
| Final receipt | Does the DecisionRecord explain the work? |

For PMs, the Critic is where many acceptance criteria become executable.
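A sketch of what "executable" means here, with the check names mirroring the table above. Each check is reduced to a boolean predicate for illustration; real checks would call evaluators, policy engines, or rubric scorers:

```python
# Sketch: a Critic verdict as executable acceptance criteria.
# Check names mirror the table; each is a simplified boolean predicate.

def critic_verdict(run: dict) -> dict:
    checks = {
        "plan_validity": run["plan"]["intent"] in run["allowed_intents"],
        "evidence_sufficiency": len(run["evidence_refs"]) > 0,
        "tool_authorization": set(run["tools_used"]) <= set(run["allowed_tools"]),
        "approval_mode": run["approval_mode"] in {"read_only", "local_write",
                                                  "network", "destructive"},
        "lane_quality": all(o.get("artifact") for o in run["lane_outputs"]),
        "final_receipt": bool(run.get("decision_record")),
    }
    failed = [name for name, ok in checks.items() if not ok]
    return {"accepted": not failed, "failed_checks": failed}
```

A rejection comes back with the failed check names, which is exactly the feedback the parent needs to re-plan.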

Context management for multi-agent products

Do not share one giant prompt across all agents.

Use per-lane context:

| Context strategy | PM meaning |
|---|---|
| Up-front briefing | Stable mission, policy, owner, output shape |
| Just-in-time retrieval | Let the lane fetch specific evidence when needed |
| Compaction | Preserve decisions and open questions, drop raw chatter |
| Structured notes | Persist progress outside the context window |
| Parent summary | Return typed output, not the full lane transcript |

This follows the practical lesson from context engineering: context is finite and should be treated as an attention budget.
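Compaction in particular can be sketched. Assuming each lane note is tagged with a kind (a hypothetical convention, not a ContextOS requirement), decisions and open questions survive while raw chatter is dropped:

```python
# Sketch of lane-context compaction, assuming each note carries a "kind" tag.
# Decisions and open questions survive; raw chatter is dropped.

KEEP_KINDS = {"decision", "open_question"}

def compact(notes: list[dict], max_notes: int = 20) -> list[dict]:
    """Treat context as an attention budget: keep only what steers future work."""
    kept = [n for n in notes if n["kind"] in KEEP_KINDS]
    return kept[-max_notes:]   # newest survive if even the keepers exceed the budget
```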

Product metrics for multi-agent systems

Do not only measure final task success.

Measure the system shape:

| Metric | Why it matters |
|---|---|
| Lane spawn rate | Detects unnecessary decomposition |
| Lane acceptance rate | Shows whether specialists produce useful artifacts |
| Parent rejection reasons | Reveals unclear lane contracts |
| Cross-lane contradiction rate | Shows context or policy conflicts |
| Tool denial rate by lane | Reveals authority mismatch |
| Critical path latency | Measures whether parallelism actually helps |
| Final DecisionRecord completeness | Determines audit readiness |

If multi-agent architecture does not improve utility, latency, or risk control, remove it.
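Several of these metrics fall out of the trace directly. A sketch, assuming each trace event carries a lane name, an accepted flag, and a rejection reason (a hypothetical event shape, not a fixed schema):

```python
# Sketch: lane acceptance rate and parent rejection reasons from trace events.
# Event shape is hypothetical: {"lane": str, "accepted": bool, "reason": str | None}

from collections import Counter

def lane_acceptance_rate(events: list[dict]) -> dict[str, float]:
    """Fraction of each lane's outputs the parent accepted."""
    totals, accepted = Counter(), Counter()
    for e in events:
        totals[e["lane"]] += 1
        accepted[e["lane"]] += e["accepted"]
    return {lane: accepted[lane] / totals[lane] for lane in totals}

def rejection_reasons(events: list[dict]) -> Counter:
    """Count why the parent rejected lane outputs; reveals unclear lane contracts."""
    return Counter(e["reason"] for e in events if not e["accepted"])
```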

Rollout path

Roll out multi-agent systems by lanes:

  1. Shadow the parent workflow with no lane side effects.
  2. Enable one read-only lane.
  3. Add lane-specific scorecards.
  4. Enable parallel lanes only after trace review shows value.
  5. Add delegated actions behind approval gates.
  6. Add destructive paths last, with rollback rehearsed.

The safest launch is not “all agents on.” It is “one lane earns trust at a time.”
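The staged rollout can be encoded as configuration rather than deployment discipline. A sketch, with stage names mirroring the list above (the risk-mode mapping is an assumption for illustration):

```python
# Sketch: rollout stages as config, so "one lane earns trust at a time"
# is enforced rather than remembered. Stage order mirrors the list above.

STAGES = ["shadow", "read_only_lane", "lane_scorecards",
          "parallel_lanes", "delegated_actions", "destructive_paths"]

def allowed_risk(stage: str) -> set[str]:
    """Approval modes a lane may use at each rollout stage."""
    idx = STAGES.index(stage)
    risk = set()                              # shadow: no side effects at all
    if idx >= 1:
        risk.add("read_only")
    if idx >= 4:
        risk |= {"local_write", "network"}    # delegated actions, behind gates
    if idx >= 5:
        risk.add("destructive")               # last, with rollback rehearsed
    return risk
```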

PM checklist

Before approving a multi-agent design, ask:

  • Why is a fixed workflow not enough?
  • Which lanes have different context, tools, risk, or evals?
  • Who owns each lane?
  • What typed artifact does each lane return?
  • Can the parent reject a lane output?
  • Which lane can create side effects?
  • Which approval gates apply?
  • What is the final DecisionRecord?
  • Which trace shows the full parent/child path?
  • What metric proves multi-agent is better than single-agent?

If the diagram cannot answer these questions, it is not architecture. It is decoration.

Found this useful? Share it.
