The Planner is creative. The Executor is mechanical. The Critic is the part that says “no” or “yes, with caveats.”
Most agent stacks I have audited ship without a Critic. The Planner emits a plan, the Executor runs it, and the result is the answer. That works for a while. It stops working the day a plan slips past with a missing evidence binding, or with a destructive step nobody approved, or with an output that does not satisfy the contract the team thought it did. By the time someone notices, two hundred runs have shipped on the same broken pattern.
The Critic is not a chat bot reviewing the model’s output. It is three small functions running at fixed points in the loop, each one returning a typed verdict that lands in the DecisionRecord. This post is those three functions in code. The canonical loop spec is in Cognitive Core; this post is the build-along.
Where the Critic sits
```
Planner → Critic.verify → Executor → Critic.score → Critic.consolidate → DecisionRecord
          (pre-execute)              (post-execute)  (final report)
```

Three calls. The first refuses bad plans before they touch the world; the second scores the result against the contract; the third packages everything into the typed report the loop emits.
The shared shapes
Every Critic call returns a typed envelope. The envelopes compose into one report:
```ts
import type { Plan, ToolCall, ToolResult, DecisionSpec, RunContext } from "@/types"
import type { Scorecard } from "@/evals/types"

export type VerifyVerdict =
  | { ok: true; reasons: string[] }
  | {
      ok: false
      kind:
        | "missing_evidence"
        | "approval_mode_mismatch"
        | "violates_decision_spec"
        | "loop_guard"
        | "budget_exceeded"
      reasons: string[]
      offending_step?: number
    }

export type ScoreVerdict = {
  ok: boolean // any hard-fail evaluator → false
  scorecard: Scorecard
}

export type CriticReport = {
  trace_id: string
  decision_key: string
  verify: VerifyVerdict
  score?: ScoreVerdict // absent if verify failed before execution
  status: "completed" | "refused_by_critic" | "partial"
  rationale: string
  decided_at: string
}
```

The shape is the contract. Everything downstream — the DecisionRecord, the Improvement Loop’s FeedbackEntry, the replay harness’s diff — reads these envelopes.
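To make the composition concrete, here is a hand-assembled CriticReport for a refused run. This is an illustrative sketch: the minimal local type declarations and the ids below stand in for the real imports so the snippet runs on its own.

```typescript
// Minimal local stand-ins for the envelope types (illustrative, not the real imports)
type VerifyVerdict =
  | { ok: true; reasons: string[] }
  | { ok: false; kind: string; reasons: string[]; offending_step?: number }

type CriticReport = {
  trace_id: string
  decision_key: string
  verify: VerifyVerdict
  status: "completed" | "refused_by_critic" | "partial"
  rationale: string
  decided_at: string
}

// A refusal: verify failed, so there is no score envelope at all
const report: CriticReport = {
  trace_id: "trace_demo_001", // hypothetical ids for illustration
  decision_key: "refund_decision_v1",
  verify: {
    ok: false,
    kind: "missing_evidence",
    reasons: ["step 1 requires evidence class refund_window_evidence, none pinned"],
    offending_step: 1,
  },
  status: "refused_by_critic",
  rationale:
    "verify failed: missing_evidence — step 1 requires evidence class refund_window_evidence, none pinned",
  decided_at: new Date().toISOString(),
}
```

Note what is absent as much as what is present: no `score` key, because nothing executed.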
Phase 1 — verify (pre-execution)
verify is what stops a bad plan from reaching the Executor. It runs after the Planner emits a plan and before any tool call fires. Five checks; each one a typed refusal:
```ts
import type { Plan, RunContext, DecisionSpec, CompiledContext } from "@/types"
import type { VerifyVerdict } from "./types"
import { MODE_RANK } from "@/tools/types"

export function verify(
  ctx: RunContext,
  compiled: CompiledContext,
  spec: DecisionSpec,
  plan: Plan,
): VerifyVerdict {
  const reasons: string[] = []

  // 1. Plan must declare an output that satisfies the DecisionSpec contract
  for (const required of spec.required_outputs) {
    if (!plan.declared_outputs.includes(required)) {
      return {
        ok: false,
        kind: "violates_decision_spec",
        reasons: [`plan does not produce required output ${required}`],
      }
    }
  }

  // 2. Every tool step the plan proposes must be in the surfaced tool manifest
  const surfaced = new Set(
    compiled.manifests.tool_manifest.map((t) => `${t.adapter_id}.${t.capability_id}`),
  )
  for (const [i, step] of plan.steps.entries()) {
    if (step.kind !== "tool") continue
    const key = `${step.adapter_id}.${step.capability_id}`
    if (!surfaced.has(key)) {
      return {
        ok: false,
        kind: "violates_decision_spec",
        reasons: [`plan step ${i} calls ${key} which is not in the surface`],
        offending_step: i,
      }
    }
  }

  // 3. Approval mode of every tool step must not exceed the run's safety_mode
  for (const [i, step] of plan.steps.entries()) {
    if (step.kind !== "tool") continue
    // Non-null assertion is safe: check 2 already proved the capability is in the manifest
    const cap = compiled.manifests.tool_manifest.find(
      (t) => t.adapter_id === step.adapter_id && t.capability_id === step.capability_id,
    )!
    if (MODE_RANK[cap.approval_mode] > MODE_RANK[ctx.safety_mode]) {
      return {
        ok: false,
        kind: "approval_mode_mismatch",
        reasons: [`step ${i} mode ${cap.approval_mode} > safety_mode ${ctx.safety_mode}`],
        offending_step: i,
      }
    }
  }

  // 4. Every step that demands evidence has the evidence pinned in the plan
  for (const [i, step] of plan.steps.entries()) {
    for (const required_class of step.requires_evidence ?? []) {
      const have = (step.evidence_refs ?? []).some((r) => {
        const entry = compiled.manifests.evidence_manifest.find((e) => e.id === r)
        // Fall back to the class encoded in the ref id, e.g. "kg:order:..."
        return entry?.classification === required_class || r.split(":")[1] === required_class
      })
      if (!have) {
        return {
          ok: false,
          kind: "missing_evidence",
          reasons: [`step ${i} requires evidence class ${required_class}, none pinned`],
          offending_step: i,
        }
      }
    }
  }

  // 5. Loop guard — total step count and total budget within bounds
  if (plan.steps.length > (ctx.run_budget?.max_steps ?? 12)) {
    return {
      ok: false,
      kind: "loop_guard",
      reasons: [`plan has ${plan.steps.length} steps, exceeds max_steps`],
    }
  }
  if (sumStepBudgets(plan) > (ctx.run_budget?.bucket_tokens ?? 8000)) {
    return { ok: false, kind: "budget_exceeded", reasons: ["plan exceeds bucket_tokens"] }
  }

  reasons.push(`${plan.steps.length} steps, ${spec.required_outputs.length} required outputs covered`)
  return { ok: true, reasons }
}

function sumStepBudgets(p: Plan): number {
  return p.steps.reduce((a, s) => a + (s.estimated_tokens ?? 0), 0)
}
```

Five checks earn their keep:
Decision-spec coverage — the plan must declare it will produce every required output. If the spec says a refund decision must produce refund_amount_inr and refund_reason_class, and the plan does not declare both, the Critic refuses. The contract is enforced before execution, not after.
Surface containment — every tool step must be in the surface that stage 3 of the compiler produced. A plan that names payments.bulk_refund when only payments.refund was surfaced is a refusal, not a runtime error.
Approval-mode within safety mode — the plan cannot escalate the run’s safety profile. A local_write run cannot include a destructive step, period.
Evidence binding — every step that requires evidence has the evidence pinned by ref. Not “the evidence will be retrieved at run time”; the ref is in the plan.
Loop guard — step count and budget are checked before execution. A 60-step plan does not get a chance to run; it gets refused with a typed verdict.
The Critic refuses before any side effect. That is the verb-tense difference between “we caught a bad plan” and “we recovered from a bad plan.”
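Check 3 leans on MODE_RANK, which is imported from @/tools/types but never shown. Here is a minimal sketch of what it plausibly looks like, assuming a four-rung ladder; only local_write and destructive are named in this post, so the other two mode names are illustrative guesses.

```typescript
// Hypothetical sketch of MODE_RANK. Only "local_write" and "destructive"
// appear in the post; "read_only" and "remote_write" are assumptions.
type ApprovalMode = "read_only" | "local_write" | "remote_write" | "destructive"

const MODE_RANK: Record<ApprovalMode, number> = {
  read_only: 0,
  local_write: 1,
  remote_write: 2,
  destructive: 3,
}

// Check 3 reduces to one comparison per step
const exceedsSafetyMode = (stepMode: ApprovalMode, safetyMode: ApprovalMode) =>
  MODE_RANK[stepMode] > MODE_RANK[safetyMode]
```

With this ladder, a destructive step in a local_write run fails the comparison (3 > 1), which is exactly the approval_mode_mismatch refusal above.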
Phase 2 — score (post-execution)
score is what runs after the Executor finishes. It feeds the run into the five evaluators, gets back a Scorecard, and decides whether the result is acceptable:
```ts
import type { Run } from "@/types"
import type { ScoreVerdict } from "./types"
import { scoreRun } from "@/evals/scorecard"

export function score(run: Run): ScoreVerdict {
  const sc = scoreRun(run)
  // Hard-fail evaluators (Policy, Safety): a "fail" from either → ok=false
  const hardFailed =
    sc.scores.policy.status === "fail" ||
    sc.scores.safety.status === "fail"
  return { ok: !hardFailed, scorecard: sc }
}
```

The function is small because it delegates. The five evaluators in Wiring the Five Evaluators do the real scoring. The Critic’s score is a thin wrapper that names which evaluators are hard-fail and produces the typed verdict.
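The Scorecard type is treated as opaque here. A minimal sketch of the shape score depends on: the post names only policy and safety as hard-fail evaluators, so any other evaluator keys are placeholders, and the field names below are assumptions.

```typescript
// Illustrative Scorecard shape; "policy" and "safety" are from the post,
// everything else (field names, extra keys) is an assumption.
type EvalStatus = "pass" | "warn" | "fail"
type EvalResult = { status: EvalStatus; score: number; findings: { message: string }[] }
type Scorecard = { scores: { policy: EvalResult; safety: EvalResult; [k: string]: EvalResult } }

// The hard-fail rule score() applies
const hardFailed = (sc: Scorecard): boolean =>
  sc.scores.policy.status === "fail" || sc.scores.safety.status === "fail"

const passing: EvalResult = { status: "pass", score: 0.95, findings: [] }
const scOk: Scorecard = { scores: { policy: passing, safety: passing } }
const scBad: Scorecard = {
  scores: {
    policy: { status: "fail", score: 0.2, findings: [{ message: "over cap" }] },
    safety: passing,
  },
}
```

A warn status deliberately does not hard-fail here; only a fail from a hard-fail evaluator flips ok.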
Phase 3 — consolidate
consolidate packages the two prior verdicts into the final CriticReport:
```ts
import type { VerifyVerdict, ScoreVerdict, CriticReport } from "./types"

export function consolidate(args: {
  trace_id: string
  decision_key: string
  verify: VerifyVerdict
  score?: ScoreVerdict
}): CriticReport {
  const { verify, score } = args
  if (!verify.ok) {
    return {
      trace_id: args.trace_id,
      decision_key: args.decision_key,
      verify,
      status: "refused_by_critic",
      rationale: `verify failed: ${verify.kind} — ${verify.reasons.join("; ")}`,
      decided_at: new Date().toISOString(),
    }
  }
  if (!score) {
    // verify passed but no execution happened (dry run / shadow)
    return {
      trace_id: args.trace_id,
      decision_key: args.decision_key,
      verify,
      status: "partial",
      rationale: "verify passed; no score (no execution)",
      decided_at: new Date().toISOString(),
    }
  }
  if (!score.ok) {
    return {
      trace_id: args.trace_id,
      decision_key: args.decision_key,
      verify,
      score,
      status: "refused_by_critic",
      rationale: hardFailReason(score),
      decided_at: new Date().toISOString(),
    }
  }
  return {
    trace_id: args.trace_id,
    decision_key: args.decision_key,
    verify,
    score,
    status: "completed",
    rationale: "verify and score passed",
    decided_at: new Date().toISOString(),
  }
}

function hardFailReason(s: ScoreVerdict): string {
  const reasons: string[] = []
  if (s.scorecard.scores.policy.status === "fail") {
    reasons.push(`policy fail: ${s.scorecard.scores.policy.findings.map((f) => f.message).join(", ")}`)
  }
  if (s.scorecard.scores.safety.status === "fail") {
    reasons.push(`safety fail: ${s.scorecard.scores.safety.findings.map((f) => f.message).join(", ")}`)
  }
  return reasons.join("; ") || "score failed (unknown reason)"
}
```

The consolidator is the part that produces the human-readable rationale on the DecisionRecord. When an auditor asks “why did the Critic refuse this run?”, the rationale string is the answer — typed kind plus one-line reason. Not a paragraph; a short string the on-call engineer can act on.
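To see the rationale string shape concretely, here is a standalone repro of hardFailReason with a hand-built score verdict. The types are redeclared in minimal form so the snippet runs on its own; the finding message is invented for illustration.

```typescript
// Local stand-ins; mirrors hardFailReason above for illustration only
type Finding = { message: string }
type EvalResult = { status: "pass" | "fail"; findings: Finding[] }
type ScoreVerdict = { ok: boolean; scorecard: { scores: { policy: EvalResult; safety: EvalResult } } }

function hardFailReason(s: ScoreVerdict): string {
  const reasons: string[] = []
  if (s.scorecard.scores.policy.status === "fail") {
    reasons.push(`policy fail: ${s.scorecard.scores.policy.findings.map((f) => f.message).join(", ")}`)
  }
  if (s.scorecard.scores.safety.status === "fail") {
    reasons.push(`safety fail: ${s.scorecard.scores.safety.findings.map((f) => f.message).join(", ")}`)
  }
  return reasons.join("; ") || "score failed (unknown reason)"
}

const verdict: ScoreVerdict = {
  ok: false,
  scorecard: {
    scores: {
      policy: { status: "fail", findings: [{ message: "refund exceeds policy cap" }] },
      safety: { status: "pass", findings: [] },
    },
  },
}
// hardFailReason(verdict) → "policy fail: refund exceeds policy cap"
```

One evaluator, one finding, one short sentence: the string an on-call engineer reads at 3 a.m.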
The loop integration
Wiring the three Critic phases into the cognitive loop takes roughly fifteen lines:
```ts
// harness/runtime/loop.ts (excerpt)
import { verify } from "@/critic/verify"
import { score } from "@/critic/score"
import { consolidate } from "@/critic/consolidate"

export async function runCanonicalLoop(args: {
  ctx: RunContext
  compiled: CompiledContext
  spec: DecisionSpec
  invoke: { req: InvokeRequest }
}) {
  const plan = await runPlanner(args.compiled, args.invoke.req)
  const verifyV = verify(args.ctx, args.compiled, args.spec, plan)
  if (!verifyV.ok) {
    return finalize(consolidate({ trace_id: args.ctx.trace_id, decision_key: args.spec.id, verify: verifyV }))
  }
  const run = await runExecutor(args.compiled, plan, args.ctx)
  const scoreV = score(run)
  const report = consolidate({ trace_id: args.ctx.trace_id, decision_key: args.spec.id, verify: verifyV, score: scoreV })
  return finalize(report)
}
```

finalize() writes the DecisionRecord from the report and the executor’s outputs. The Critic is the only thing that decides whether the run completed; the executor produces facts, but it does not produce verdicts.
A worked refund: one fail, one pass
Two plans for the same intent. The first fails verify; the second passes both phases.
Plan A — missing evidence binding. The Planner emits:
```jsonc
{
  "steps": [
    { "kind": "tool", "adapter_id": "adp_orders", "capability_id": "lookup", "args": { "id": "ord_881" } },
    {
      "kind": "tool",
      "adapter_id": "adp_payments",
      "capability_id": "issue_refund",
      "args": { "id": "pay_8861", "amount_inr": 24500 },
      "requires_evidence": ["refund_window_evidence"]
      // NO evidence_refs pinned
    }
  ],
  "declared_outputs": ["refund_amount_inr", "refund_reason_class"]
}
```

verify returns:
```json
{
  "ok": false,
  "kind": "missing_evidence",
  "reasons": ["step 1 requires evidence class refund_window_evidence, none pinned"],
  "offending_step": 1
}
```

The Executor never runs. The CriticReport lands in the DecisionRecord with status: "refused_by_critic" and the rationale above. No refund is issued; no side effects fire.
Plan B — correctly bound. Same plan with evidence_refs: ["kg:refund_window:rw_881#snapshot_kg_2026_05_09_T0930"] on step 1. verify passes:
```json
{ "ok": true, "reasons": ["2 steps, 2 required outputs covered"] }
```

The Executor runs, produces a Run. score runs the five evaluators; all five pass with score >= 0.9. consolidate returns status: "completed", and the run finalizes as a successful refund.
What this changes
Three things on day one of running a real Critic.
Plan-level bugs cannot reach production. A plan that misses evidence, exceeds safety mode, or fails decision-spec coverage gets a typed refusal at the boundary. The class of “the agent did something it should not have because the plan was wrong” stops shipping.
The audit story carries the rationale. Every DecisionRecord has a verify envelope and a score envelope. The auditor asks “why was this run refused?” and gets the typed kind plus the offending step. That is what audit looks like when it is a property of the system, not a story.
The Critic is replayable. Both phases are pure functions of typed inputs. The replay harness runs them on recorded Plans and Runs and gets the same verdicts back. Critic behavior is part of the determinism contract.
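A sketch of what that determinism check can look like in the replay harness. The helper and the stand-in check below are hypothetical, not code from the harness; the point is that two replays of a pure verdict function over the same recorded input must serialize identically.

```typescript
// Hypothetical replay-harness helper: compare two verdicts byte-for-byte.
// A mismatch means hidden state leaked into a supposedly pure phase.
const verdictsMatch = <T>(a: T, b: T): boolean =>
  JSON.stringify(a) === JSON.stringify(b)

// Stand-in pure check (illustrative): the loop-guard rule from verify()
const loopGuard = (stepCount: number) =>
  stepCount > 12
    ? { ok: false as const, kind: "loop_guard" as const }
    : { ok: true as const }

const recorded = 2 // recorded plan step count from a prior run
const first = loopGuard(recorded)
const second = loopGuard(recorded)
```

Serialized comparison is enough here because the verdicts are plain JSON-shaped objects built in a fixed key order; a structural deep-equal works just as well.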
Three files. Eighty lines. The Planner is creative; the Executor is mechanical; the Critic is the part that says no when no is the right answer. Wire it up between Plan and Execute and the rest of the harness has something to anchor on.