Cognitive Core

The Context + Decision plane runtime that turns requests into governed, auditable execution.

Foundational Spec · Last reviewed: 2026-05-09

At a glance

Context plane · Decision plane · Per-request compilation

The Context-plane compiler and Decision-plane loop that turn a request into a typed, replayable verdict.

Inputs

invokeAgent request envelope
Run Context (run_id, trace_id, session_id, tenant_id, safety_mode, run_budget, user, agent)
Pinned Context Pack version(s)
Knowledge Graph snapshot
Memory state (working / episodic / semantic / durable)

Outputs

CompiledContext (prompt, manifests, runtime_controls, budget_report)
DecisionRecord (decision_id, evidence_refs, approvals, controls_active)
Tool transcripts and evidence manifest
Memory write proposals
Trace bundle

Lifecycle

compile
plan
verify
execute
score
consolidate

Canonical types

CompiledContext
ContextPack
RunContext
DecisionRecord

The Cognitive Core spans the Context plane (the Compiler) and the Decision plane (the cognitive loop). It is the runtime that compiles a bounded Context Pack and then drives the layered loop that produces a Decision Record.

Definition

A two-half runtime: (a) the Context Pack Compiler turns intent + Run Context + pack references into a CompiledContext envelope; (b) the cognitive loop advances a turn through perception, attention, identity, goals, judgment, policy gate, action, consolidation, and reflection under a Run Context with explicit budgets and loop guards.

Why it exists

LLMs are non-deterministic and context-hungry. Without a compiler, every team rolls its own prompt assembly and drifts. Without a layered loop, the runtime cannot be inspected step by step or recovered after a failure. The Cognitive Core gives the spec a single, layered, replayable boundary between the model and everything else.

How it works

Compile: the Context Pack Compiler assembles intelligence + memory + policy + tools into a bounded CompiledContext.
Loop: the cognitive loop advances one turn at a time, with each layer reading and writing the Run Context.
Decide: the Planner / Executor / Critic triad inside the loop produces a typed Decision Record.
Govern: every layer is policy-aware; effects route through the Tool Gateway under approval-mode tiers.
Observe: every layer emits OTEL spans tied to the run’s trace_id.

Cognitive loop (turn-level layers)

Each turn decomposes into a layered loop. Layers are pure functions over the Run Context: they read it, write structured effects, and never side-step each other.

Layer	Responsibility	Writes into Run Context
Runtime budget gate	gate execution on remaining budget; emit heartbeat	`gate_decision`
Perception	normalize input (channel, locale, attachments, session signals) into typed events	`events[]`
Focus mode	optional deep-work gate that buffers low-salience interrupts	`focus_state`
Attention / Salience	rank events + memory recalls by relevance to current goals; gate what enters working context	`focus`, `salience_scores`
Identity	inject the agent’s role, voice, and non-negotiables; colors every downstream judgment	`agent_self`
Goals / Intent	resolve the active intent against the Intent-Task Catalog and registered objectives	`intent`, `active_goals`
Judgment	the LLM call: produce a candidate plan, decision, or response under the compiled prompt	`candidate`
Policy gate	deterministic policy + guardrail check on the candidate before any side effect	`verdict`, `obligations`
Action	execute approved tool calls through the Tool Gateway with retries and idempotency	`effects[]`
Consolidation	persist evidence, memory write proposals, and trace impressions for the next turn	`proposals[]`
Reflection	post-turn meta-check: did we meet the goal, were budgets honored, what should change next time	`reflection`

The Run Context (run_id, trace_id, session_id, tenant_id, safety_mode, run_budget) flows through every layer so each step is independently traceable and replay-safe. Layers can be skipped only when safety_mode permits it (e.g., read_only mode skips the action layer entirely).
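
The layered turn described above can be sketched as a list of functions that each read the Run Context and return a structured patch. This is a minimal illustration, not the spec's implementation: the `RunContext` shape is reduced to a few fields, only four of the layers are shown, and the `"standard"` mode name, the `SKIPPED_BY_MODE` table, and the stub layer bodies are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class RunContext:
    run_id: str
    trace_id: str
    safety_mode: str = "standard"  # hypothetical default; the text only names read_only
    state: dict = field(default_factory=dict)    # structured writes from each layer
    visited: list = field(default_factory=list)  # layer names, in execution order

Layer = Callable[[RunContext], dict]

# Stub layers: each reads the context and returns the keys it writes.
def perception(ctx: RunContext) -> dict:
    return {"events": ["user_message"]}

def judgment(ctx: RunContext) -> dict:
    return {"candidate": {"tool": "orders.lookup"}}

def policy_gate(ctx: RunContext) -> dict:
    return {"verdict": "allow", "obligations": []}

def action(ctx: RunContext) -> dict:
    tool = ctx.state["candidate"]["tool"]
    return {"effects": [{"tool": tool, "result": "ok"}]}

LAYERS: list[tuple[str, Layer]] = [
    ("perception", perception),
    ("judgment", judgment),
    ("policy_gate", policy_gate),
    ("action", action),
]

# Assumed mapping from safety_mode to the layers it allows the loop to skip.
SKIPPED_BY_MODE = {"read_only": {"action"}}

def run_turn(ctx: RunContext) -> RunContext:
    skipped = SKIPPED_BY_MODE.get(ctx.safety_mode, set())
    for name, layer in LAYERS:
        if name in skipped:
            continue  # e.g. read_only never reaches the action layer
        ctx.state.update(layer(ctx))
        ctx.visited.append(name)
    return ctx
```

Running `run_turn` with `safety_mode="read_only"` leaves `effects` unset, which is the behavior the paragraph above describes for the action layer.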

Context Pack Compiler

The compiler is a pure pipeline:

Intent classification — resolve raw input to an intent in the Intent-Task Catalog.
Policy resolution — evaluate JsonLogic rules against the Run Context; produce must_refuse, must_escalate, requires_approval_gate, prohibited_capabilities.
Tool surfacing — intersect Registry ∩ Permissions − Prohibitions and apply approval-mode constraints.
Evidence retrieval — query the Knowledge Graph under hop budgets and freshness windows.
Memory recall — pull promoted memory only (never raw capture) under classification and consent rules.
Token budget allocation — distribute the run budget across context buckets (business, policy, tool, evidence, memory, session).
Bucket assembly — pack each bucket to its allocated tokens; truncate by priority, never silently.
Manifests + runtime controls — emit the compiled_prompt, policy_manifest, tool_manifest, evidence_manifest, and runtime_controls as the CompiledContext.
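
Two of the pipeline steps above lend themselves to a short sketch: tool surfacing as set algebra, and bucket assembly that truncates by priority while recording that it did so. The function names, the `Bucket` fields, and the `(priority, token_cost, text)` candidate shape are assumptions for illustration; the spec does not define them here.

```python
from dataclasses import dataclass

def surface_tools(registry: set, permissions: set, prohibitions: set) -> set:
    """Registry ∩ Permissions − Prohibitions, as in the tool surfacing step."""
    return (registry & permissions) - prohibitions

@dataclass
class Bucket:
    name: str
    allocated_tokens: int
    used_tokens: int
    items: list
    truncated: bool  # explicit flag so truncation is never silent

def pack_bucket(name: str, candidates: list, allocated_tokens: int) -> Bucket:
    """Pack (priority, token_cost, text) candidates, highest priority first.

    Items that do not fit in the remaining allocation are dropped and the
    bucket is marked as truncated.
    """
    kept, used, truncated = [], 0, False
    for _priority, cost, text in sorted(candidates, key=lambda c: -c[0]):
        if used + cost <= allocated_tokens:
            kept.append(text)
            used += cost
        else:
            truncated = True
    return Bucket(name, allocated_tokens, used, kept, truncated)
```

A caller would run `pack_bucket` once per context bucket after token budget allocation, then carry each bucket's `truncated` flag into the budget report.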

Triple-check governance

Before any action, the runtime enforces three checks in order:

Permission level — is the tool generally allowed for this agent role?
Rule level — is the tool allowed for this specific intent?
Situational level — is the action safe given the specific evidence (e.g., order.age_days > 30)?

If any check fails, the capability is redacted from the model’s surface — not just rejected at execution.
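
The three checks can be sketched as a filter that decides which tools the model is shown at all. This is an illustration under assumptions: `ToolRule` and `visible_tools` are hypothetical names, and the situational rule reads the `order.age_days > 30` example as "refunds are unsafe for orders older than 30 days", which the text does not state explicitly.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class ToolRule:
    name: str
    allowed_roles: frozenset      # permission level
    allowed_intents: frozenset    # rule level
    situational_check: Callable[[dict], bool]  # situational level, over evidence

def visible_tools(rules: list, role: str, intent: str, evidence: dict) -> list:
    """Return only the tools that pass all three checks, evaluated in order.

    A tool that fails any check is left off the surface entirely, so the
    model never sees it, rather than being rejected later at execution.
    """
    surface = []
    for rule in rules:
        if role not in rule.allowed_roles:
            continue
        if intent not in rule.allowed_intents:
            continue
        if not rule.situational_check(evidence):
            continue
        surface.append(rule.name)
    return surface

# Hypothetical rules for the refund scenario.
RULES = [
    ToolRule("orders.lookup", frozenset({"support_agent"}),
             frozenset({"support.refund"}), lambda ev: True),
    ToolRule("refund.issue", frozenset({"support_agent"}),
             frozenset({"support.refund"}),
             lambda ev: ev["order"]["age_days"] <= 30),
]
```

With an order that is 45 days old, `refund.issue` is absent from the returned surface while `orders.lookup` remains available.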

Turn budgets and loop guards

Each turn declares a budget envelope; the runtime enforces it deterministically rather than trusting the model.

Token budget per Context Pack bucket.
Tool-call budget per turn and per workflow.
Wall-clock budget with heartbeat for long-running sessions.
Loop guard: repeated identical tool calls or no-progress reflection cycles short-circuit with a structured loop_detected reason returned to the planner.
Re-plan budget: capped re-planning attempts to prevent infinite loops.

Budget accumulators are atomic so parallel subagent lanes cannot race.
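
A minimal sketch of the two mechanisms above, assuming an in-process runtime: a lock-protected accumulator that parallel lanes can share without overspending, and a loop guard that returns a structured `loop_detected` reason. The class names, the `max_repeats` parameter, and the shape of the returned reason are assumptions; a distributed runtime would need a different atomic primitive.

```python
import json
import threading
from collections import Counter

class BudgetAccumulator:
    """Tool-call (or token) budget that is safe to share across threads."""

    def __init__(self, limit: int):
        self.limit = limit
        self._used = 0
        self._lock = threading.Lock()

    def try_consume(self, amount: int = 1) -> bool:
        # Check and increment under one lock so two lanes cannot both
        # spend the last remaining unit.
        with self._lock:
            if self._used + amount > self.limit:
                return False
            self._used += amount
            return True

    @property
    def used(self) -> int:
        with self._lock:
            return self._used

class LoopGuard:
    """Trips when the same tool call (name + arguments) repeats too often."""

    def __init__(self, max_repeats: int = 3):
        self.max_repeats = max_repeats
        self._seen = Counter()

    def check(self, tool: str, args: dict):
        # Canonical JSON makes the key independent of argument order.
        key = (tool, json.dumps(args, sort_keys=True))
        self._seen[key] += 1
        if self._seen[key] > self.max_repeats:
            return {"reason": "loop_detected", "tool": tool,
                    "repeats": self._seen[key]}
        return None
```

Returning a structured reason instead of raising lets the planner decide whether to re-plan or escalate, within its own re-plan budget.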

Functional state and pressure signals

Beyond the deterministic Run Budget, the runtime carries a functional state snapshot per turn — a typed, bounded scalar set that captures the current pressure on the loop. Functional state is observable by the Critic (informs accept / escalate verdicts) and never mutates the policy boundary or model behavior directly.

Signal	Range	Source	Triggers when
`budget_pressure`	0.0 – 1.0	RunBudget accumulators	budget consumed > soft threshold
`loop_pressure`	0.0 – 1.0	repeat-call frequency, replan attempts	loop guard approaching trip
`evidence_pressure`	0.0 – 1.0	unresolved required_evidence ratio	required_evidence resolution stalls
`gate_pressure`	0.0 – 1.0	gate latency vs TTL	approval gate aging beyond soft band
`conflict_pressure`	0.0 – 1.0	unresolved memory contradictions	KG / memory conflicts surface mid-turn
`escalation_propensity`	0.0 – 1.0	aggregate of above with intent risk class	composite signal exposed to the Critic

{
  "functional_state": {
    "captured_at": "2026-05-04T09:31:18Z",
    "budget_pressure": 0.62,
    "loop_pressure": 0.10,
    "evidence_pressure": 0.00,
    "gate_pressure": 0.34,
    "conflict_pressure": 0.00,
    "escalation_propensity": 0.41
  }
}

Properties

Captured at every turn boundary; persisted on the Decision Record’s lineage block.
Read-only for Judgment and Action layers; the model never sees raw scalars.
The Critic may use elevated escalation_propensity to prefer escalate over replan when otherwise tied.
The Improvement Loop treats sustained high-pressure runs as candidate insights.
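
The snapshot and the Critic tie-break can be sketched as an immutable, range-checked record plus a small decision helper. The field names come from the table above; the `critic_tiebreak` function and its `0.5` threshold are assumptions for illustration, since the text does not say what counts as "elevated".

```python
from dataclasses import dataclass, fields

@dataclass(frozen=True)  # frozen: layers can read the snapshot but not mutate it
class FunctionalState:
    budget_pressure: float
    loop_pressure: float
    evidence_pressure: float
    gate_pressure: float
    conflict_pressure: float
    escalation_propensity: float

    def __post_init__(self):
        # Every signal is a bounded scalar in the range 0.0 to 1.0.
        for f in fields(self):
            value = getattr(self, f.name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{f.name}={value} is outside 0.0 to 1.0")

def critic_tiebreak(state: FunctionalState, threshold: float = 0.5) -> str:
    """Choose between replan and escalate when the Critic is otherwise tied.

    The threshold is a hypothetical value, not one defined by the spec.
    """
    return "escalate" if state.escalation_propensity >= threshold else "replan"
```

With the sample snapshot shown earlier (`escalation_propensity` of 0.41) this helper would prefer `replan` under the assumed threshold.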

Implementation mapping

These components and contracts implement the Cognitive Core at runtime:

Implementation references

Interfaces

Inputs

invokeAgent request envelope
Run Context (run_id, trace_id, session_id, tenant_id, safety_mode, run_budget, user, agent)
Pinned Context Pack version(s)
Knowledge Graph snapshot
Memory state (working / episodic / semantic / durable)

Outputs

CompiledContext (prompt, manifests, runtime_controls, budget_report)
Decision Record (decision_id, evidence_refs, approvals, controls_active)
Tool transcripts and evidence manifest
Memory write proposals
Trace bundle

Failure modes

Context overpacked — silent truncation hides evidence; mitigated by always-explicit truncated flag.
Policy resolution caches stale rules; mitigated by version pinning and pack hash check.
Loop guard trips without returning an informative reason to the planner.
Subagent lane mutates the parent Run Context; each lane must receive an isolated copy.
Reflection layer recommends actions that bypass the policy gate on next turn.
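
One way to guard against the subagent-lane failure mode is to hand each lane a deep copy of the parent context. The sketch below assumes a reduced `RunContext` with a plain `state` dict; `fork_for_lane` and the `lane_id` key are hypothetical names. Shared budget accumulators would be passed alongside the copy rather than copied, so that parallel lanes still draw on one budget.

```python
import copy
from dataclasses import dataclass, field

@dataclass
class RunContext:
    run_id: str
    trace_id: str
    state: dict = field(default_factory=dict)

def fork_for_lane(parent: RunContext, lane_id: str) -> RunContext:
    """Give a subagent lane its own copy of the parent Run Context.

    deepcopy also copies nested lists and dicts, so writes made inside
    the lane cannot leak back into the parent.
    """
    child = copy.deepcopy(parent)
    child.state["lane_id"] = lane_id
    return child
```

The lane keeps the parent's `run_id` and `trace_id`, so its spans still join the same trace.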

Operational concerns

Latency budget per stage; budget-exceeded converts to a Critic escalate verdict.
Pack version pinning per environment; promotion is deliberate.
Tool timeout and retry budgets enforced by the Run Context, not the adapter.
Trace retention by data classification; long-running session checkpoints.
Replay determinism: pin the Knowledge Graph snapshot, the pack version(s), and the recorded tool transcripts.
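
As an illustration of the replay-determinism concern, the pinned inputs can be folded into a single fingerprint that a replay harness compares before re-running a turn. The function name and the identifier formats used in the usage note are assumptions; only `hashlib` and `json` from the Python standard library are used.

```python
import hashlib
import json

def replay_fingerprint(snapshot_id: str, pack_versions: list, transcripts: list) -> str:
    """Hash the pinned inputs of a run into one stable hex digest.

    Sorting keys and pack versions makes the digest independent of
    dictionary ordering and of the order in which packs were listed.
    Transcript order is preserved because tool calls are sequential.
    """
    payload = json.dumps(
        {
            "snapshot": snapshot_id,
            "packs": sorted(pack_versions),
            "transcripts": transcripts,
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

For example, `replay_fingerprint("kg_snap_01", ["returns_pack@4"], [{"tool": "orders.lookup", "result": "ok"}])` yields the same digest on every run, and a changed pack version yields a different one.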

Evaluation metrics

Plan-verification pass rate per intent.
Tool-success rate and recovery rate.
Evidence-backed output rate (decisions with all required_evidence resolved).
Policy-compliance rate.
Mean time to safe completion.
Loop-guard trip rate (should trend toward zero with autotune).
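
Several of these metrics are simple ratios over recorded runs. The sketch below shows two of them; the record fields `unresolved_evidence` and `loop_guard_tripped` are hypothetical, since the spec does not define the shape of the records these metrics are computed from.

```python
def rate(records, predicate) -> float:
    """Fraction of records that satisfy predicate; 0.0 for an empty input."""
    records = list(records)
    if not records:
        return 0.0
    return sum(1 for r in records if predicate(r)) / len(records)

def evidence_backed_rate(decisions) -> float:
    # A decision counts as evidence-backed when no required evidence is unresolved.
    return rate(decisions, lambda d: not d["unresolved_evidence"])

def loop_guard_trip_rate(runs) -> float:
    return rate(runs, lambda r: r["loop_guard_tripped"])
```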

Example

A condensed end-to-end trace of one turn, with one key per layer (the runtime budget gate and focus mode layers are omitted):

{
  "run_id": "run_a1b2c3d4e5f60718",
  "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
  "request": "Issue refund for order ord_881",
  "perception":   { "events": [ "user_message", "session_resumed" ] },
  "attention":    { "focus": ["order:ord_881", "policy:returns_v4"], "salience_top": 0.91 },
  "identity":     { "agent_role": "support_agent", "voice": "neutral_concise" },
  "intent":       { "name": "support.refund", "task_template": "refund_with_eligibility" },
  "judgment":     { "candidate_plan_id": "plan_refund_01" },
  "policy_gate":  { "verdict": "allow_with_gate", "approval_gate": "GATE_FINANCE_APPROVAL" },
  "action":       [ { "tool": "orders.lookup", "result": "ok" }, { "tool": "policy.eval", "result": "ok" } ],
  "consolidation":{ "memory_proposals": ["customer_prefers_email_updates"] },
  "reflection":   { "goal_met": true, "budget_remaining": "78%" }
}

Common misconceptions

The Cognitive Core is not a single model. It is a layered runtime with a deterministic compiler, a layered loop, and explicit budgets.
The compiler is not optional. Without it, every team rolls its own prompt assembly and drifts.
Reflection is not introspection theater. It writes a structured record that the Improvement Loop consumes.