Orchestrator

Decision-plane component that runs the bounded Planner / Executor / Critic triad.

Reference Design. Last reviewed: 2026-05-04

At a glance

Decision plane · Bounded execution loop

One Run Context's worth of execution: read CompiledContext, run the triad, emit a DecisionRecord.

Inputs

CompiledContext from the Compiler
Run Context with RunBudget (atomic accumulators)
DecisionSpec registry from the Decision Catalog
Tool registry with declared approval modes

Outputs

Typed Plan and step transcripts
DecisionRecord with evidence_refs, approvals, controls_active
Memory write proposals
OTEL trace bundle stitched across subagent lanes
Session checkpoints (in long_running mode)

Lifecycle

plan
verify
execute
score
consolidate

Canonical types

Plan
DecisionRecord
ToolCall
ToolResult
BackgroundSession

Reference Architecture

The Orchestrator is the bounded execution loop of the Decision plane. It runs the Planner / Executor / Critic triad, manages subagent lanes, and persists checkpoints for durable background sessions.

Definition

A coordinator component that owns one Run Context’s worth of execution. It reads the CompiledContext, runs the triad until the Critic emits a terminal verdict, and produces a DecisionRecord. See Orchestration for the spec narrative.
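As a rough illustration of what the Orchestrator hands back, the sketch below models a DecisionRecord with the three fields listed under Outputs. Everything else is an assumption made for the example: the `Verdict` enum, the `run_id` field, the tuple-of-strings field types, and in particular the choice to treat accept and escalate as the terminal verdicts.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Verdict(Enum):
    # The four verdicts the Critic can render.
    ACCEPT = "accept"
    RETRY = "retry"
    REPLAN = "replan"
    ESCALATE = "escalate"


# Assumption: accept and escalate end the run; retry and replan keep it going.
TERMINAL_VERDICTS = frozenset({Verdict.ACCEPT, Verdict.ESCALATE})


def is_terminal(verdict: Verdict) -> bool:
    """True when the triad should stop looping (under the assumption above)."""
    return verdict in TERMINAL_VERDICTS


@dataclass(frozen=True)
class DecisionRecord:
    # run_id and the field types are hypothetical; only the names
    # evidence_refs, approvals, and controls_active come from the text.
    run_id: str
    verdict: Verdict
    evidence_refs: Tuple[str, ...] = ()
    approvals: Tuple[str, ...] = ()
    controls_active: Tuple[str, ...] = ()
```

Freezing the record is a design choice for the sketch: an immutable record is easier to audit and replay than one that can change after it is emitted.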

Why it exists

A single agentic loop conflates planning, execution, and judgment, making failures uninterpretable. Splitting them gives the runtime three independently auditable artifacts (plan, transcript, verdict), enables replay, and lets each role enforce its own budget.

How it works

Plan: the Planner reads the CompiledContext and emits a typed Plan with steps, tool intents, and decision checkpoints.
Verify: the Critic checks the plan against tool allow-lists, evidence requirements, and approval-mode declarations.
Execute: the Executor runs verified steps via the Tool Manager. Network / delegated / destructive steps are split into propose → approve → execute against a frozen evidence snapshot.
Score: the Critic scores each completed step on the evaluators and renders accept / retry / replan / escalate.
Consolidate: effects and evidence are extracted into memory write proposals.
Loop or terminate: re-plan attempts are capped by RunBudget.max_replan_attempts.
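The lifecycle above can be sketched as a small control loop. This is a minimal illustration, not the reference implementation: the callables `planner`, `verify`, `execute`, and `score`, the `max_step_retries` parameter, and the `"completed"` / `"escalated"` outcome strings are all hypothetical. Only `RunBudget.max_replan_attempts` and the four verdicts come from the text. The consolidate step is left out to keep the loop readable.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence, Tuple


@dataclass
class RunBudget:
    # Default of 2 follows the operational guidance on re-plan budgets.
    max_replan_attempts: int = 2


Transcript = List[Tuple[Any, Any, str]]


def run_triad(
    planner: Callable[[Transcript], Sequence[Any]],
    verify: Callable[[Sequence[Any]], bool],
    execute: Callable[[Any], Any],
    score: Callable[[Any, Any], str],
    budget: RunBudget,
    max_step_retries: int = 1,  # hypothetical per-step retry cap
) -> Tuple[str, Transcript]:
    transcript: Transcript = []
    replans = 0
    while True:
        plan = planner(transcript)            # Plan
        needs_replan = not verify(plan)       # Verify
        if not needs_replan:
            for step in plan:
                attempts = 0
                while True:
                    result = execute(step)            # Execute
                    verdict = score(step, result)     # Score
                    transcript.append((step, result, verdict))
                    if verdict == "retry" and attempts < max_step_retries:
                        attempts += 1
                        continue
                    break
                if verdict == "escalate":
                    return "escalated", transcript
                if verdict != "accept":       # replan, or retries exhausted
                    needs_replan = True
                    break
        if not needs_replan:
            return "completed", transcript
        if replans >= budget.max_replan_attempts:
            # Budget exhausted: stop looping and hand off instead.
            return "escalated", transcript
        replans += 1
```

In this sketch a rejected plan and a replan verdict draw on the same budget, so the loop is bounded by `max_replan_attempts + 1` plans regardless of which role triggers the re-plan.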

Subagent lanes

Lanes are spawned with their own Run Context, token budget, tool surface, and trace span.
Lane outputs return as typed envelopes; lane traces are stitched into the parent trace.
Lanes cannot mutate the parent’s effects[]; they propose results that the parent’s Critic accepts or rejects.
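One way to picture the lane contract is that a lane never receives a reference to the parent at all: it returns an immutable envelope, and only the parent writes to its own effects. The sketch below assumes hypothetical names throughout (`LaneEnvelope`, `ParentRun`, `run_lane`, `merge_lane`, and the `work` and `critic_accepts` callables) and models effects and spans as plain strings.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class LaneEnvelope:
    # Typed, immutable result of one lane. Field names are hypothetical.
    lane_id: str
    proposed_effects: Tuple[str, ...]
    trace_spans: Tuple[str, ...]


def run_lane(lane_id: str, work: Callable[[int], List[str]], token_budget: int) -> LaneEnvelope:
    """Run lane work in isolation; the lane only sees its own budget."""
    proposals = work(token_budget)
    return LaneEnvelope(
        lane_id=lane_id,
        proposed_effects=tuple(proposals),
        trace_spans=(f"span:{lane_id}",),
    )


@dataclass
class ParentRun:
    effects: List[str] = field(default_factory=list)
    trace: List[str] = field(default_factory=list)

    def merge_lane(self, envelope: LaneEnvelope, critic_accepts: Callable[[str], bool]) -> List[str]:
        # Traces are stitched in whether or not any proposal is accepted.
        self.trace.extend(envelope.trace_spans)
        # Only proposals the parent's Critic accepts become parent effects.
        accepted = [e for e in envelope.proposed_effects if critic_accepts(e)]
        self.effects.extend(accepted)
        return accepted
```

Because the envelope is frozen and the lane holds no handle to `ParentRun`, the "lanes cannot mutate parent effects" invariant holds by construction in this sketch rather than by runtime policing.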

Background sessions

In long_running mode, a checkpoint is persisted after every Critic verdict.
A session is resumable by session_id against the same pinned pack and snapshot.
Operator interruptions enqueue a checkpoint resume.
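A minimal sketch of checkpoint and resume, assuming an in-memory store and JSON serialization. The `Checkpoint` fields other than `session_id`, the `CheckpointStore` and `ResumeRefused` names, and the `snapshot_mismatch` reason are hypothetical; `pack_version_mismatch` is the reason named under Failure modes.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict


class ResumeRefused(Exception):
    """Raised when a session cannot be resumed; carries a structured reason."""

    def __init__(self, reason: str):
        super().__init__(reason)
        self.reason = reason


@dataclass
class Checkpoint:
    session_id: str
    pack_version: str   # pinned pack the session started with
    snapshot_id: str    # pinned evidence snapshot
    step_index: int     # hypothetical: next step to run on resume
    last_verdict: str   # the Critic verdict that triggered this checkpoint


class CheckpointStore:
    def __init__(self) -> None:
        # Assumption: an in-memory dict stands in for durable storage.
        self._blobs: Dict[str, str] = {}

    def save(self, cp: Checkpoint) -> None:
        # Called after every Critic verdict; the latest checkpoint wins.
        self._blobs[cp.session_id] = json.dumps(asdict(cp))

    def resume(self, session_id: str, pack_version: str, snapshot_id: str) -> Checkpoint:
        cp = Checkpoint(**json.loads(self._blobs[session_id]))
        if cp.pack_version != pack_version:
            raise ResumeRefused("pack_version_mismatch")
        if cp.snapshot_id != snapshot_id:
            raise ResumeRefused("snapshot_mismatch")  # hypothetical reason name
        return cp
```

Checking the pins at resume time, rather than trusting the caller, keeps a resumed session from silently running against different policy or evidence than the one it was checkpointed under.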

Failure modes

Plan contains disallowed tools or missing approvals: caught by the Critic at verify.
Tool retries cause duplicate side effects: mitigated by idempotency keys at the Tool Manager.
Subagent lane modifies parent effects: invariant violation; the lane is terminated.
Background session resumes against a different pack version: the resume is refused with pack_version_mismatch.
Loop guard trips silently without a structured reason: this is a bug; the loop guard must always set the loop_detected reason.
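Two of these mitigations are easy to illustrate. The sketch below shows a hypothetical `ToolManager` that deduplicates retried calls by an idempotency key, and a hypothetical `LoopGuard` that never trips without returning a structured reason. The key derivation (a SHA-256 hash over run id, step id, tool name, and arguments), the plan-fingerprint approach, and the `max_repeats` threshold are assumptions for the example, not the documented design.

```python
import hashlib
import json
from typing import Any, Callable, Dict, Optional


def idempotency_key(run_id: str, step_id: str, tool: str, args: Dict[str, Any]) -> str:
    # Assumption: the same run, step, tool, and arguments mean "the same call".
    payload = json.dumps([run_id, step_id, tool, args], sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ToolManager:
    def __init__(self) -> None:
        self._results: Dict[str, Any] = {}

    def call(self, run_id: str, step_id: str, tool: str,
             args: Dict[str, Any], fn: Callable[..., Any]) -> Any:
        key = idempotency_key(run_id, step_id, tool, args)
        if key in self._results:
            # Retry of a call that already succeeded: replay the stored
            # result instead of repeating the side effect.
            return self._results[key]
        result = fn(**args)
        self._results[key] = result
        return result


class LoopGuard:
    def __init__(self, max_repeats: int = 3) -> None:
        self.max_repeats = max_repeats
        self._seen: Dict[str, int] = {}

    def check(self, plan_fingerprint: str) -> Optional[Dict[str, Any]]:
        """Return None while under the limit, else a structured reason."""
        count = self._seen.get(plan_fingerprint, 0) + 1
        self._seen[plan_fingerprint] = count
        if count > self.max_repeats:
            return {
                "reason": "loop_detected",
                "fingerprint": plan_fingerprint,
                "repeats": count,
            }
        return None
```

Returning the reason from `check` itself, instead of a bare boolean, makes the "trips silently" failure mode impossible to express in this sketch: a caller cannot learn that the guard tripped without also receiving `loop_detected`.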

Operational concerns

Re-plan budgets per workflow; default 2, raised only with rationale.
Verification cost and strictness scale with risk_class.
Workflow timeouts and SLA enforcement at the Run Context boundary.
Subagent lane fan-out limits to bound cost and latency.
Background session checkpoint storage and TTL by tenant.
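These limits could be captured in a per-workflow configuration object that enforces the "raised only with rationale" rule at construction time. The sketch below is illustrative: `WorkflowLimits` and every field name and default other than the re-plan default of 2 are hypothetical.

```python
from dataclasses import dataclass

DEFAULT_MAX_REPLAN_ATTEMPTS = 2  # default stated in the guidance above


@dataclass(frozen=True)
class WorkflowLimits:
    max_replan_attempts: int = DEFAULT_MAX_REPLAN_ATTEMPTS
    replan_rationale: str = ""            # required when raising the budget
    max_lane_fanout: int = 4              # hypothetical default
    timeout_seconds: int = 300            # hypothetical default
    checkpoint_ttl_seconds: int = 86_400  # hypothetical default, set per tenant

    def __post_init__(self) -> None:
        if self.max_replan_attempts < 0:
            raise ValueError("max_replan_attempts must be non-negative")
        raised = self.max_replan_attempts > DEFAULT_MAX_REPLAN_ATTEMPTS
        if raised and not self.replan_rationale.strip():
            raise ValueError(
                "raising max_replan_attempts above the default requires a rationale"
            )
        if self.max_lane_fanout < 1:
            raise ValueError("max_lane_fanout must be at least 1")
```

For example, `WorkflowLimits(max_replan_attempts=3)` is rejected, while `WorkflowLimits(max_replan_attempts=3, replan_rationale="flaky upstream API")` is accepted.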

Evaluation metrics

Plan-verification pass rate.
Step completion rate.
Retry and recovery rate.
Escalation rate by risk tier.
Subagent lane success rate vs. parent re-plan rate.
Mean time to safe completion.
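Several of these metrics are simple ratios over per-run counters. The sketch below computes three of them from a hypothetical `RunSummary` record; the record shape, the field names, and the convention of reporting 0.0 when a denominator is zero are assumptions for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class RunSummary:
    # Hypothetical per-run counters exported by the Orchestrator.
    risk_tier: str
    plans_submitted: int
    plans_verified: int
    steps_started: int
    steps_completed: int
    escalated: bool


def _rate(numerator: int, denominator: int) -> float:
    # Assumption: an empty denominator reports 0.0 rather than raising.
    return numerator / denominator if denominator else 0.0


def orchestrator_metrics(runs: Iterable[RunSummary]) -> Dict[str, object]:
    runs = list(runs)
    # Per-tier [escalated_count, run_count] pairs.
    by_tier: Dict[str, List[int]] = defaultdict(lambda: [0, 0])
    for r in runs:
        by_tier[r.risk_tier][0] += int(r.escalated)
        by_tier[r.risk_tier][1] += 1
    return {
        "plan_verification_pass_rate": _rate(
            sum(r.plans_verified for r in runs),
            sum(r.plans_submitted for r in runs),
        ),
        "step_completion_rate": _rate(
            sum(r.steps_completed for r in runs),
            sum(r.steps_started for r in runs),
        ),
        "escalation_rate_by_risk_tier": {
            tier: _rate(esc, total) for tier, (esc, total) in by_tier.items()
        },
    }
```

Aggregating counts before dividing, rather than averaging per-run rates, weights each plan and step equally, so a few short runs do not skew the result.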