Orchestrator
Decision-plane component that runs the bounded Planner / Executor / Critic triad.
One Run Context's worth of execution: read CompiledContext, run the triad, emit a DecisionRecord.
- plan
- verify
- execute
- score
- consolidate
- Plan
- DecisionRecord
- ToolCall
- ToolResult
- BackgroundSession
The Orchestrator is the bounded execution loop of the Decision plane. It runs the Planner / Executor / Critic triad, manages subagent lanes, and persists checkpoints for durable background sessions.
Definition
A coordinator component that owns one Run Context’s worth of execution. Reads the CompiledContext; runs the triad until the Critic emits a terminal verdict; produces a DecisionRecord. See Orchestration for the spec narrative.
Why it exists
A single agentic loop conflates planning, execution, and judgment, making failures uninterpretable. Splitting them gives the runtime three independently auditable artifacts (plan, transcript, verdict), enables replay, and lets each role enforce its own budget.
Inputs
- CompiledContext from the Compiler
- Run Context with RunBudget (atomic accumulators)
- DecisionSpec registry from the Decision Catalog
- Tool registry with declared approval modes
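To make "atomic accumulators" concrete, here is a minimal sketch of a budget whose counters are checked and incremented under one lock, so concurrent lanes cannot overspend. Only `RunBudget` and `max_replan_attempts` come from this page; `max_tokens`, `try_spend_tokens`, and `try_replan` are hypothetical names chosen for illustration, not the runtime's actual API.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class RunBudget:
    """Illustrative budget; only max_replan_attempts is named in the spec text."""
    max_replan_attempts: int = 2
    max_tokens: int = 10_000  # assumed field, for illustration only
    _replans: int = field(default=0, init=False)
    _tokens: int = field(default=0, init=False)
    _lock: threading.Lock = field(
        default_factory=threading.Lock, init=False, repr=False
    )

    def try_spend_tokens(self, n: int) -> bool:
        # Check and increment happen under the same lock, so two callers
        # cannot both pass the check and jointly exceed the cap.
        with self._lock:
            if self._tokens + n > self.max_tokens:
                return False
            self._tokens += n
            return True

    def try_replan(self) -> bool:
        # Returns False once the re-plan cap has been reached.
        with self._lock:
            if self._replans >= self.max_replan_attempts:
                return False
            self._replans += 1
            return True
```

A caller would treat a `False` return as "budget exhausted" and stop spending rather than retrying.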
Outputs
- Typed Plan and step transcripts
- DecisionRecord with evidence_refs, approvals, controls_active
- Memory write proposals
- OTEL trace bundle stitched across subagent lanes
- Session checkpoints (in long_running mode)
How it works
- Plan: Planner reads CompiledContext; emits typed Plan with steps, tool intents, decision checkpoints.
- Verify: Critic checks plan against tool allow-lists, evidence requirements, approval-mode declarations.
- Execute: Executor runs verified steps via the Tool Manager. For network/delegated/destructive, splits into propose → approve → execute against a frozen evidence snapshot.
- Score: Critic scores each completed step on the evaluators; renders accept/retry/replan/escalate.
- Consolidate: extracts effects + evidence into memory write proposals.
- Loop or terminate: re-plan attempts capped by RunBudget.max_replan_attempts.
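The steps above can be sketched as a small control loop. This is a simplified illustration, not the runtime's implementation: the four callables stand in for the Planner, Critic (verify and score), and Executor. The `outcome` strings, `max_step_retries`, and the choice to stop with `replan_budget_exhausted` when the cap is hit are all assumptions. Consolidation is reduced to collecting accepted results as evidence.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable


class Verdict(Enum):
    ACCEPT = "accept"
    RETRY = "retry"
    REPLAN = "replan"
    ESCALATE = "escalate"


@dataclass
class Plan:
    steps: list[str]


@dataclass
class DecisionRecord:
    outcome: str  # assumed field; the spec lists evidence_refs, approvals, controls_active
    evidence_refs: list[str] = field(default_factory=list)
    replans_used: int = 0


def run_triad(
    plan_fn: Callable[[dict], Plan],
    verify_fn: Callable[[Plan], bool],
    execute_fn: Callable[[str], str],
    score_fn: Callable[[str, str], Verdict],
    compiled_context: dict,
    max_replan_attempts: int = 2,
    max_step_retries: int = 1,  # hypothetical knob, not named in the spec
) -> DecisionRecord:
    replans = 0
    while True:
        plan = plan_fn(compiled_context)          # Plan
        evidence: list[str] = []
        needs_replan = not verify_fn(plan)        # Verify
        if not needs_replan:
            for step in plan.steps:
                retries = 0
                while True:
                    result = execute_fn(step)             # Execute
                    verdict = score_fn(step, result)      # Score
                    if verdict is Verdict.RETRY and retries < max_step_retries:
                        retries += 1
                        continue
                    break
                if verdict is Verdict.ACCEPT:
                    evidence.append(result)       # stand-in for Consolidate
                    continue
                if verdict is Verdict.ESCALATE:
                    return DecisionRecord("escalated", evidence, replans)
                needs_replan = True  # REPLAN, or RETRY with retries exhausted
                break
        if not needs_replan:
            return DecisionRecord("completed", evidence, replans)
        if replans >= max_replan_attempts:
            # Assumed behavior: stop with a structured outcome at the cap.
            return DecisionRecord("replan_budget_exhausted", evidence, replans)
        replans += 1
```

Keeping the cap check at the bottom of the loop means a failed verification and a `replan` verdict draw on the same budget, which is one plausible reading of "re-plan attempts capped".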
Subagent lanes
- Lanes are spawned with their own Run Context, token budget, tool surface, trace span.
- Lane outputs return as typed envelopes; lane traces stitched into the parent.
- Lanes cannot mutate parent effects[]; they propose results the parent’s Critic accepts or rejects.
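One way to picture the lane contract is a minimal sketch in which the lane receives an immutable view of the parent's effects and hands back a frozen envelope. Only the parent merges, and only after its Critic accepts. `LaneEnvelope`, `ParentRun`, `spawn_lane`, `merge`, and all field names are hypothetical; the spec only states that envelopes are typed and traces are stitched into the parent.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass(frozen=True)
class LaneEnvelope:
    """Typed result a lane hands back to its parent (field names are assumptions)."""
    lane_id: str
    proposed_effects: tuple[str, ...]
    trace_span_id: str


@dataclass
class ParentRun:
    effects: list[str] = field(default_factory=list)
    stitched_spans: list[str] = field(default_factory=list)

    def spawn_lane(
        self,
        lane_id: str,
        work: Callable[[tuple[str, ...]], Iterable[str]],
    ) -> LaneEnvelope:
        # The lane gets a tuple copy, so it has no handle on the parent's list.
        view = tuple(self.effects)
        return LaneEnvelope(lane_id, tuple(work(view)), f"span-{lane_id}")

    def merge(
        self,
        envelope: LaneEnvelope,
        critic_accepts: Callable[[LaneEnvelope], bool],
    ) -> bool:
        # The lane's trace is stitched in whether or not its result is accepted.
        self.stitched_spans.append(envelope.trace_span_id)
        if not critic_accepts(envelope):
            return False
        self.effects.extend(envelope.proposed_effects)
        return True
```

In this sketch a rejected envelope still leaves a trace span behind, which keeps rejected lanes auditable; whether the real runtime does the same is not stated here.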
Background sessions
- A long_running mode persists a checkpoint after every Critic verdict.
- Resumable by session_id against the same pinned pack and snapshot.
- Operator interruptions enqueue a checkpoint resume.
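The checkpoint-and-resume rule can be illustrated with a small in-memory store. This is a sketch under assumptions: `CheckpointStore`, `Checkpoint`, `ResumeRefused`, and the `snapshot_mismatch` reason are invented for the example. Only `session_id`, the pinned pack and snapshot, and the `pack_version_mismatch` refusal are named on this page.

```python
from dataclasses import dataclass
from typing import Optional


class ResumeRefused(Exception):
    """Raised when a resume request does not match the pinned inputs."""

    def __init__(self, reason: str):
        super().__init__(reason)
        self.reason = reason


@dataclass(frozen=True)
class Checkpoint:
    session_id: str
    pack_version: str
    snapshot_id: str
    verdict_index: int   # how many Critic verdicts preceded this one
    last_verdict: str


class CheckpointStore:
    def __init__(self) -> None:
        self._latest: dict[str, Checkpoint] = {}

    def save_after_verdict(
        self, session_id: str, pack_version: str, snapshot_id: str, verdict: str
    ) -> Checkpoint:
        # Called once per Critic verdict; keeps only the most recent checkpoint.
        prev: Optional[Checkpoint] = self._latest.get(session_id)
        index = prev.verdict_index + 1 if prev else 0
        cp = Checkpoint(session_id, pack_version, snapshot_id, index, verdict)
        self._latest[session_id] = cp
        return cp

    def resume(self, session_id: str, pack_version: str, snapshot_id: str) -> Checkpoint:
        cp = self._latest[session_id]  # KeyError if the session is unknown
        if cp.pack_version != pack_version:
            raise ResumeRefused("pack_version_mismatch")
        if cp.snapshot_id != snapshot_id:
            raise ResumeRefused("snapshot_mismatch")  # assumed reason name
        return cp
```

A production store would also need durability and per-tenant TTL handling, both of which are out of scope for this sketch.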
Failure modes
- Plan contains disallowed tools or missing approvals: caught by Critic at verify.
- Tool retries cause duplicate side effects: mitigated by idempotency keys at the Tool Manager.
- Subagent lane modifies parent effects: invariant violation; lane terminated.
- Background session resumes against a different pack version: refuse with pack_version_mismatch.
- Loop guard trips silently without a structured reason: bug; loop guard must always set loop_detected reason.
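Two of these mitigations lend themselves to a short sketch: idempotency keys that turn a retried tool call into a cache hit, and a loop guard that never trips without a structured reason. The key derivation (a SHA-256 over run, step, tool, and arguments), the `ToolManager.call` signature, and the repeat-count heuristic in `LoopGuard` are assumptions made for illustration. Only the `loop_detected` reason and the use of idempotency keys at the Tool Manager come from the text.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Optional


class ToolManager:
    """Toy tool manager: a retried call with the same key replays the stored result."""

    def __init__(self) -> None:
        self._results: dict[str, str] = {}
        self.side_effect_count = 0

    @staticmethod
    def idempotency_key(run_id: str, step_id: str, tool: str, args: dict) -> str:
        # Assumed derivation: a stable hash of everything that identifies the call.
        payload = json.dumps([run_id, step_id, tool, args], sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def call(self, run_id: str, step_id: str, tool: str, args: dict) -> str:
        key = self.idempotency_key(run_id, step_id, tool, args)
        if key in self._results:
            return self._results[key]   # retry: no second side effect
        self.side_effect_count += 1     # stand-in for the real side effect
        result = f"{tool}:ok"
        self._results[key] = result
        return result


@dataclass(frozen=True)
class LoopGuardTrip:
    reason: str
    detail: str


class LoopGuard:
    def __init__(self, max_repeats: int = 3) -> None:
        self._max_repeats = max_repeats
        self._seen: dict[str, int] = {}

    def observe(self, plan_fingerprint: str) -> Optional[LoopGuardTrip]:
        # Trips when the same plan fingerprint recurs max_repeats times,
        # and always carries the structured reason when it does.
        count = self._seen.get(plan_fingerprint, 0) + 1
        self._seen[plan_fingerprint] = count
        if count >= self._max_repeats:
            return LoopGuardTrip(
                "loop_detected", f"{plan_fingerprint!r} seen {count} times"
            )
        return None
```

Returning a `LoopGuardTrip` object rather than a bare boolean makes the "silent trip" failure mode hard to write by accident, since there is no way to signal a trip without supplying a reason.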
Operational concerns
- Re-plan budgets per workflow; default 2, raised only with rationale.
- Verification cost and strictness scale with risk_class.
- Workflow timeouts and SLA enforcement at the Run Context boundary.
- Subagent lane fan-out limits to bound cost and latency.
- Background session checkpoint storage and TTL by tenant.
Evaluation metrics
- Plan-verification pass rate.
- Step completion rate.
- Retry and recovery rate.
- Escalation rate by risk tier.
- Subagent lane success rate vs. parent re-plan rate.
- Mean time to safe completion.