Cognitive Core

The Context + Decision plane runtime that turns requests into governed, auditable execution.

Foundational Spec
At a glance
Context plane · Decision plane · Per-request compilation

The Context-plane compiler and Decision-plane loop that turns a request into a typed, replayable verdict.

Inputs
  • invokeAgent request envelope
  • Run Context (run_id, trace_id, session_id, tenant_id, safety_mode, run_budget, user, agent)
  • Pinned Context Pack version(s)
  • Knowledge Graph snapshot
  • Memory state (working / episodic / semantic / durable)
Outputs
  • CompiledContext (prompt, manifests, runtime_controls, budget_report)
  • DecisionRecord (decision_id, evidence_refs, approvals, controls_active)
  • Tool transcripts and evidence manifest
  • Memory write proposals
  • Trace bundle
Lifecycle
  1. compile
  2. plan
  3. verify
  4. execute
  5. score
  6. consolidate
Canonical types
  • CompiledContext
  • ContextPack
  • RunContext
  • DecisionRecord

The Cognitive Core spans the Context plane (the Compiler) and the Decision plane (the cognitive loop). It is the runtime that compiles a bounded Context Pack and then drives the layered loop that produces a Decision Record.

Definition

A two-half runtime: (a) the Context Pack Compiler turns intent + Run Context + pack references into a CompiledContext envelope; (b) the cognitive loop advances a turn through perception, attention, identity, goals, judgment, policy gate, action, consolidation, and reflection under a Run Context with explicit budgets and loop guards.

Why it exists

LLMs are non-deterministic and context-hungry. Without a compiler, every team rolls its own prompt assembly and drifts. Without a layered loop, the runtime cannot be inspected step-by-step or recovered from failure. The Cognitive Core gives the spec a single, layered, replayable boundary between the model and everything else.

How it works

  1. Compile: the Context Pack Compiler assembles intelligence + memory + policy + tools into a bounded CompiledContext.
  2. Loop: the cognitive loop advances one turn at a time, with each layer reading and writing the Run Context.
  3. Decide: the Planner / Executor / Critic triad inside the loop produces a typed Decision Record.
  4. Govern: every layer is policy-aware; effects route through the Tool Gateway under approval-mode tiers.
  5. Observe: every layer emits OTEL spans tied to the run’s trace_id.

Cognitive loop (turn-level layers)

Each turn decomposes into a layered loop. Layers are pure functions over the Run Context: they read it, write structured effects, and never side-step each other.

Layer | Responsibility | Writes into Run Context
Runtime budget gate | gate execution on remaining budget; emit heartbeat | gate_decision
Perception | normalize input (channel, locale, attachments, session signals) into typed events | events[]
Focus mode | optional deep-work gate that buffers low-salience interrupts | focus_state
Attention / Salience | rank events + memory recalls by relevance to current goals; gate what enters working context | focus, salience_scores
Identity | inject the agent’s role, voice, and non-negotiables; colors every downstream judgment | agent_self
Goals / Intent | resolve the active intent against the Intent-Task Catalog and registered objectives | intent, active_goals
Judgment | the LLM call: produce a candidate plan, decision, or response under the compiled prompt | candidate
Policy gate | deterministic policy + guardrail check on the candidate before any side effect | verdict, obligations
Action | execute approved tool calls through the Tool Gateway with retries and idempotency | effects[]
Consolidation | persist evidence, memory write proposals, and trace impressions for the next turn | proposals[]
Reflection | post-turn meta-check: did we meet the goal, were budgets honored, what should change next time | reflection

The Run Context (run_id, trace_id, session_id, tenant_id, safety_mode, run_budget) flows through every layer so each step is independently traceable and replay-safe. Layers can be skipped only when safety_mode permits it (e.g., read_only mode skips the action layer entirely).
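A minimal sketch of the layered loop, assuming each layer is a function that reads the Run Context and returns the keys it writes. The `RunContext.state` field, the `SKIPPED_BY_MODE` table, the `"standard"` mode name, and the canned layer bodies are hypothetical and only illustrate the shape; just four of the layers from the table are shown.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Set, Tuple


@dataclass
class RunContext:
    run_id: str
    trace_id: str
    safety_mode: str  # assumed values: "standard", "read_only"
    state: Dict[str, Any] = field(default_factory=dict)  # accumulated layer writes


# A layer reads the Run Context and returns the keys it wants written.
Layer = Callable[[RunContext], Dict[str, Any]]


def perception(ctx: RunContext) -> Dict[str, Any]:
    return {"events": ["user_message"]}


def judgment(ctx: RunContext) -> Dict[str, Any]:
    # Stand-in for the LLM call; returns a fixed candidate plan.
    return {"candidate": {"plan_id": "plan_01"}}


def policy_gate(ctx: RunContext) -> Dict[str, Any]:
    return {"verdict": "allow", "obligations": []}


def action(ctx: RunContext) -> Dict[str, Any]:
    # Only acts when the policy gate allowed the candidate.
    if ctx.state.get("verdict") != "allow":
        return {"effects": []}
    return {"effects": [{"tool": "orders.lookup", "result": "ok"}]}


LAYERS: List[Tuple[str, Layer]] = [
    ("perception", perception),
    ("judgment", judgment),
    ("policy_gate", policy_gate),
    ("action", action),
]

# Hypothetical skip table: which layers a safety_mode removes from the turn.
SKIPPED_BY_MODE: Dict[str, Set[str]] = {"read_only": {"action"}}


def run_turn(ctx: RunContext) -> List[str]:
    """Run each layer in order and return the names of the layers that ran."""
    skipped = SKIPPED_BY_MODE.get(ctx.safety_mode, set())
    executed: List[str] = []
    for name, layer in LAYERS:
        if name in skipped:
            continue
        ctx.state.update(layer(ctx))
        executed.append(name)
    return executed
```

Keeping the skip decision in the loop driver rather than inside each layer means a read_only run never reaches the action code path at all, which matches the "skips the action layer entirely" wording above.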

Context Pack Compiler

The compiler is a pure pipeline:

  1. Intent classification — resolve raw input to an intent in the Intent-Task Catalog.
  2. Policy resolution — evaluate JsonLogic rules against the Run Context; produce must_refuse, must_escalate, requires_approval_gate, prohibited_capabilities.
  3. Tool surfacing — intersect Registry ∩ Permissions − Prohibitions and apply approval-mode constraints.
  4. Evidence retrieval — query the Knowledge Graph under hop budgets and freshness windows.
  5. Memory recall — pull promoted memory only (never raw capture) under classification and consent rules.
  6. Token budget allocation — distribute the run budget across context buckets (business, policy, tool, evidence, memory, session).
  7. Bucket assembly — pack each bucket to its allocated tokens; truncate by priority, never silently.
  8. Manifests + runtime controls — emit the compiled_prompt, policy_manifest, tool_manifest, evidence_manifest, and runtime_controls as the CompiledContext.
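Steps 6 and 7 can be sketched as follows. This is an illustration under stated assumptions, not the spec's allocator: the percentage split in `BUCKET_PERCENT`, the `PackedBucket` shape, and the greedy priority-order packing are all hypothetical. The one property taken from the text is that truncation is always reported, never silent.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical split of the run budget across the six context buckets (percent).
BUCKET_PERCENT: Dict[str, int] = {
    "business": 10,
    "policy": 20,
    "tool": 15,
    "evidence": 30,
    "memory": 15,
    "session": 10,
}


def allocate(run_budget_tokens: int) -> Dict[str, int]:
    """Distribute the run budget across buckets using integer arithmetic."""
    return {b: run_budget_tokens * pct // 100 for b, pct in BUCKET_PERCENT.items()}


@dataclass
class PackedBucket:
    items: List[str]      # item ids that made it into the bucket
    tokens_used: int
    truncated: bool       # explicit flag: True if anything was dropped
    dropped: List[str]    # ids of the items that did not fit


def pack_bucket(candidates: List[Tuple[int, str, int]], limit: int) -> PackedBucket:
    """Pack (priority, item_id, token_cost) tuples; lower priority number wins."""
    kept: List[str] = []
    dropped: List[str] = []
    used = 0
    for _, item_id, cost in sorted(candidates, key=lambda c: c[0]):
        if used + cost <= limit:
            kept.append(item_id)
            used += cost
        else:
            dropped.append(item_id)
    return PackedBucket(kept, used, bool(dropped), dropped)
```

For example, packing three evidence items costing 120, 200, and 150 tokens into a 350-token bucket keeps the first two and reports the third in `dropped` with `truncated` set, so downstream consumers can see what was cut.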

Triple-check governance

Before any action, the runtime enforces three checks in order:

  1. Permission level — is the tool generally allowed for this agent role?
  2. Rule level — is the tool allowed for this specific intent?
  3. Situational level — is the action safe given the specific evidence (e.g., order.age_days > 30)?

If any check fails, the capability is redacted from the model’s surface — not just rejected at execution.
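The three checks and the redaction behavior can be sketched as a filter over the tool registry. Everything named here is hypothetical: the permission and rule tables, the tool names other than those in the example trace, and the reading of the order.age_days > 30 example as "orders older than 30 days are not refundable".

```python
from typing import Any, Callable, Dict, List, Set

# Check 1 (hypothetical data): tools generally allowed per agent role.
ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    "support_agent": {"orders.lookup", "refunds.issue", "policy.eval"},
}

# Check 2 (hypothetical data): tools allowed for a specific intent.
INTENT_RULES: Dict[str, Set[str]] = {
    "support.refund": {"orders.lookup", "refunds.issue", "policy.eval"},
    "support.status": {"orders.lookup"},
}

# Check 3 (assumption): a predicate over evidence that returns True when unsafe.
SITUATIONAL_BLOCKS: Dict[str, Callable[[Dict[str, Any]], bool]] = {
    "refunds.issue": lambda evidence: evidence.get("order.age_days", 0) > 30,
}


def surface_tools(
    registry: List[str], role: str, intent: str, evidence: Dict[str, Any]
) -> List[str]:
    """Return only the tools that pass all three checks, in registry order.

    A tool that fails any check is omitted from the returned surface, so the
    model is never offered it, rather than being rejected later at execution.
    """
    allowed_for_role = ROLE_PERMISSIONS.get(role, set())
    allowed_for_intent = INTENT_RULES.get(intent, set())
    surfaced: List[str] = []
    for tool in registry:
        if tool not in allowed_for_role:        # 1. permission level
            continue
        if tool not in allowed_for_intent:      # 2. rule level
            continue
        is_unsafe = SITUATIONAL_BLOCKS.get(tool)
        if is_unsafe is not None and is_unsafe(evidence):  # 3. situational level
            continue
        surfaced.append(tool)
    return surfaced
```

With a 45-day-old order, `refunds.issue` disappears from the surfaced list for a `support.refund` intent while the read-only tools remain available.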

Turn budgets and loop guards

Each turn declares a budget envelope; the runtime enforces it deterministically rather than trusting the model.

  • Token budget per Context Pack bucket.
  • Tool-call budget per turn and per workflow.
  • Wall-clock budget with heartbeat for long-running sessions.
  • Loop guard: repeated identical tool calls or no-progress reflection cycles short-circuit with a structured loop_detected reason returned to the planner.
  • Re-plan budget: capped re-planning attempts to prevent infinite loops.

Budget accumulators are atomic so parallel subagent lanes cannot race.
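A minimal sketch of two of these guards, assuming a lock-protected counter for the atomic accumulator and a repeat counter keyed on the tool name plus canonicalized arguments for the loop guard. The class names, the `max_repeats` threshold, and the shape of the returned reason are hypothetical; only the `loop_detected` reason string comes from the text.

```python
import json
import threading
from collections import Counter
from typing import Any, Dict, Optional


class BudgetAccumulator:
    """Check-and-consume under one lock so parallel lanes cannot overspend."""

    def __init__(self, limit: int) -> None:
        self._limit = limit
        self._used = 0
        self._lock = threading.Lock()

    def try_consume(self, amount: int) -> bool:
        with self._lock:
            if self._used + amount > self._limit:
                return False  # budget exhausted; the caller must not proceed
            self._used += amount
            return True

    @property
    def used(self) -> int:
        with self._lock:
            return self._used


class LoopGuard:
    """Trip when the same tool call (name + arguments) repeats too often."""

    def __init__(self, max_repeats: int = 3) -> None:
        self._max_repeats = max_repeats
        self._seen: Counter = Counter()

    def check(self, tool: str, args: Dict[str, Any]) -> Optional[Dict[str, Any]]:
        # sort_keys makes the key independent of argument ordering.
        key = (tool, json.dumps(args, sort_keys=True))
        self._seen[key] += 1
        if self._seen[key] > self._max_repeats:
            return {"reason": "loop_detected", "tool": tool, "repeats": self._seen[key]}
        return None
```

Returning a structured reason instead of raising keeps the trip visible to the planner, which can then spend one of its capped re-plan attempts rather than retrying the same call.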

Functional state and pressure signals

Beyond the deterministic Run Budget, the runtime carries a functional state snapshot per turn — a typed, bounded scalar set that captures the current pressure on the loop. Functional state is observable by the Critic (informs accept / escalate verdicts) and never mutates the policy boundary or model behavior directly.

Signal | Range | Source | Triggers when
budget_pressure | 0.0 – 1.0 | RunBudget accumulators | budget consumed > soft threshold
loop_pressure | 0.0 – 1.0 | repeat-call frequency, replan attempts | loop guard approaching trip
evidence_pressure | 0.0 – 1.0 | unresolved required_evidence ratio | required_evidence resolution stalls
gate_pressure | 0.0 – 1.0 | gate latency vs TTL | approval gate aging beyond soft band
conflict_pressure | 0.0 – 1.0 | unresolved memory contradictions | KG / memory conflicts surface mid-turn
escalation_propensity | 0.0 – 1.0 | aggregate of above with intent risk class | composite signal exposed to the Critic
{
  "functional_state": {
    "captured_at": "2026-05-04T09:31:18Z",
    "budget_pressure": 0.62,
    "loop_pressure": 0.10,
    "evidence_pressure": 0.00,
    "gate_pressure": 0.34,
    "conflict_pressure": 0.00,
    "escalation_propensity": 0.41
  }
}

Properties

  • Captured at every turn boundary; persisted on the Decision Record’s lineage block.
  • Read-only for Judgment and Action layers; the model never sees raw scalars.
  • The Critic may use elevated escalation_propensity to prefer escalate over replan when otherwise tied.
  • The Improvement Loop treats sustained high-pressure runs as candidate insights.
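The text does not define how escalation_propensity is aggregated, so the sketch below invents one: the mean of the five pressures scaled by a per-risk-class multiplier and clamped to the 0.0 to 1.0 range. The formula, the `RISK_MULTIPLIER` table, and the tie threshold values are assumptions and will not reproduce the sample values shown above. The tie-break in `critic_verdict` illustrates the third property: propensity matters only when the two verdicts are otherwise tied.

```python
from dataclasses import dataclass
from typing import Dict

# Hypothetical multipliers per intent risk class.
RISK_MULTIPLIER: Dict[str, float] = {"low": 0.8, "medium": 1.0, "high": 1.2}


@dataclass(frozen=True)  # frozen: the snapshot is read-only once captured
class FunctionalState:
    budget_pressure: float
    loop_pressure: float
    evidence_pressure: float
    gate_pressure: float
    conflict_pressure: float

    def escalation_propensity(self, intent_risk: str) -> float:
        """Assumed aggregate: mean pressure times risk multiplier, clamped to [0, 1]."""
        pressures = (
            self.budget_pressure,
            self.loop_pressure,
            self.evidence_pressure,
            self.gate_pressure,
            self.conflict_pressure,
        )
        mean = sum(pressures) / len(pressures)
        return min(1.0, max(0.0, mean * RISK_MULTIPLIER[intent_risk]))


def critic_verdict(
    replan_score: float,
    escalate_score: float,
    propensity: float,
    threshold: float = 0.5,
    tie_eps: float = 0.05,
) -> str:
    """Pick replan or escalate; propensity only breaks near-ties."""
    if abs(replan_score - escalate_score) <= tie_eps:
        return "escalate" if propensity >= threshold else "replan"
    return "escalate" if escalate_score > replan_score else "replan"
```

Because the scalars feed only the Critic's choice between verdicts, they inform the decision without changing which capabilities the policy gate allows.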

Implementation mapping

These components and contracts implement the Cognitive Core at runtime:

Implementation references

Interfaces

Inputs

  • invokeAgent request envelope
  • Run Context (run_id, trace_id, session_id, tenant_id, safety_mode, run_budget, user, agent)
  • Pinned Context Pack version(s)
  • Knowledge Graph snapshot
  • Memory state (working / episodic / semantic / durable)

Outputs

  • CompiledContext (prompt, manifests, runtime_controls, budget_report)
  • Decision Record (decision_id, evidence_refs, approvals, controls_active)
  • Tool transcripts and evidence manifest
  • Memory write proposals
  • Trace bundle

Failure modes

  • Context overpacked: silent truncation hides evidence; mitigated by an always-explicit truncated flag.
  • Policy resolution caches stale rules; mitigated by version pinning and a pack hash check.
  • Loop guard trips without returning an informative reason to the planner.
  • Subagent lane mutates the parent Run Context (each lane must receive an isolated copy).
  • Reflection layer recommends actions that bypass the policy gate on the next turn.

Operational concerns

  • Latency budget per stage; budget-exceeded converts to a Critic escalate verdict.
  • Pack version pinning per environment; promotion is deliberate.
  • Tool timeout and retry budgets enforced by the Run Context, not the adapter.
  • Trace retention by data classification; long-running session checkpoints.
  • Replay determinism: requires pinning the Knowledge Graph snapshot and the pack version, and replaying recorded tool transcripts.
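One way to picture the replay requirement is a fingerprint over the pinned inputs plus a gateway stand-in that serves recorded results instead of calling live tools. This is a sketch under assumptions: `replay_fingerprint`, `ReplayGateway`, and the transcript record shape (`tool`, `args`, `result`) are hypothetical names, not part of the spec.

```python
import hashlib
import json
from typing import Any, Dict, List


def replay_fingerprint(
    snapshot_id: str, pack_version: str, transcripts: List[Dict[str, Any]]
) -> str:
    """Hash the pinned inputs; equal fingerprints mean the same replay inputs."""
    payload = json.dumps(
        {"snapshot": snapshot_id, "pack": pack_version, "transcripts": transcripts},
        sort_keys=True,             # canonical key order
        separators=(",", ":"),      # canonical whitespace
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ReplayGateway:
    """Serves recorded tool results in order; fails loudly if the run diverges."""

    def __init__(self, transcripts: List[Dict[str, Any]]) -> None:
        self._queue = list(transcripts)

    def call(self, tool: str, args: Dict[str, Any]) -> Any:
        if not self._queue:
            raise RuntimeError("replay diverged: no recorded call left")
        recorded = self._queue.pop(0)
        if recorded["tool"] != tool or recorded["args"] != args:
            raise RuntimeError(
                f"replay diverged: expected {recorded['tool']}, got {tool}"
            )
        return recorded["result"]
```

Failing on the first mismatched call, rather than falling back to a live tool, keeps a replay from silently mixing recorded and fresh results.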

Evaluation metrics

  • Plan-verification pass rate per intent.
  • Tool-success rate and recovery rate.
  • Evidence-backed output rate (decisions with all required_evidence resolved).
  • Policy-compliance rate.
  • Mean time to safe completion.
  • Loop-guard trip rate (should trend toward zero with autotune).
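Two of these metrics can be computed directly from per-run records. The record shapes below (`required_evidence`, `evidence_refs`, `loop_guard_tripped`) are hypothetical field names chosen for illustration, and returning 0.0 for an empty input is an arbitrary convention.

```python
from typing import Any, Dict, List


def evidence_backed_rate(decisions: List[Dict[str, Any]]) -> float:
    """Share of decisions whose required_evidence is fully covered by evidence_refs."""
    if not decisions:
        return 0.0
    backed = sum(
        1
        for d in decisions
        if set(d["required_evidence"]) <= set(d["evidence_refs"])
    )
    return backed / len(decisions)


def loop_guard_trip_rate(runs: List[Dict[str, Any]]) -> float:
    """Share of runs in which the loop guard tripped at least once."""
    if not runs:
        return 0.0
    tripped = sum(1 for r in runs if r.get("loop_guard_tripped", False))
    return tripped / len(runs)
```

A decision with no required evidence counts as evidence-backed under this definition, since the empty set is a subset of any set; a stricter definition could exclude such decisions from the denominator.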

Example

A condensed end-to-end trace of one turn, with one key per layer (the runtime budget gate and focus mode layers are omitted):

{
  "run_id": "run_a1b2c3d4e5f60718",
  "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
  "request": "Issue refund for order ord_881",
  "perception":   { "events": [ "user_message", "session_resumed" ] },
  "attention":    { "focus": ["order:ord_881", "policy:returns_v4"], "salience_top": 0.91 },
  "identity":     { "agent_role": "support_agent", "voice": "neutral_concise" },
  "intent":       { "name": "support.refund", "task_template": "refund_with_eligibility" },
  "judgment":     { "candidate_plan_id": "plan_refund_01" },
  "policy_gate":  { "verdict": "allow_with_gate", "approval_gate": "GATE_FINANCE_APPROVAL" },
  "action":       [ { "tool": "orders.lookup", "result": "ok" }, { "tool": "policy.eval", "result": "ok" } ],
  "consolidation":{ "memory_proposals": ["customer_prefers_email_updates"] },
  "reflection":   { "goal_met": true, "budget_remaining": "78%" }
}

Common misconceptions

  • The Cognitive Core is not a single model. It is a layered runtime with a deterministic compiler, a layered loop, and explicit budgets.
  • The compiler is not optional. Without it, every team rolls its own prompt assembly and drifts.
  • Reflection is not introspection theater. It writes a structured record that the Improvement Loop consumes.