Foundations

The operating model for ContextOS: five planes, cross-cutting primitives, and the contracts that make agent runs governable.

Living Document. Last reviewed: 2026-07-11

At a glance

ContextOS foundations are the engineering boundaries around production agents. They define what the runtime knows, what it gives the model, how decisions are made, which actions are allowed, and how every run becomes auditable evidence.

The 2026 framework revision keeps these planes and strengthens their cross-cutting contracts for model routing, durable execution, multimodal evidence, computer use, and scoped agent delegation.

The foundation rule

An agent run is acceptable only when it can answer five questions:

Question	Foundation owner	Runtime artifact
What did the system know?	Intelligence plane	ontology version, CEIDs, graph evidence, promoted memory
What did the model actually see?	Context plane	`ContextPack`, `CompiledContext`, source manifests, budget report
Why did it choose that path?	Decision plane	plan, critic verdicts, decision spec binding
What external effects happened?	Action plane	`ToolEnvelope`, approval mode, idempotency key, tool result
Who approved and how can we replay it?	Trust plane	policy decision, evaluator result, trace id, `DecisionRecord`, replay handle

If one answer is missing, the run is not yet production-grade. It may still be a useful prototype, but it is not a ContextOS-governed run.
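
The rule above can be sketched as a simple completeness check. This is a minimal illustration, not the ContextOS implementation: the dictionary layout, the artifact key names, and the functions `unanswered_planes` and `is_governed_run` are all assumptions made for the example.

```python
PLANES = ("intelligence", "context", "decision", "action", "trust")


def unanswered_planes(artifacts: dict[str, dict[str, object]]) -> list[str]:
    """Return the planes whose question has no non-empty artifact recorded."""
    return [
        plane
        for plane in PLANES
        if not any(artifacts.get(plane, {}).values())
    ]


def is_governed_run(artifacts: dict[str, dict[str, object]]) -> bool:
    """A run is acceptable only when all five questions have an answer."""
    return not unanswered_planes(artifacts)


# Hypothetical run: every plane has artifacts except Trust, whose values are empty.
run = {
    "intelligence": {"ontology_version": "v12", "ceids": ["ceid:acct:1"]},
    "context": {"compiled_context": "cc-001"},
    "decision": {"plan": "plan-001", "critic_verdicts": ["pass"]},
    "action": {"tool_envelopes": ["env-001"]},
    "trust": {"trace_id": None, "replay_handle": ""},
}
print(unanswered_planes(run))  # ['trust']
```

A real check would likely require specific artifacts per plane rather than "at least one"; the looser rule here only keeps the sketch short.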

The five planes

Intelligence plane

The substrate of meaning: canonical schema, identity, evidence, memory, and retrieval.

Ontology -> Identity -> Graph -> Memory

Context plane

The compiler that turns request state, evidence, policy, tools, and memory into bounded model input.

ContextPack -> CompiledContext
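
As a rough sketch of what "bounded model input" could mean in practice, the following greedy selector fills a token budget and reports what it left out. The `Candidate` type, the priority convention, and the `compile_context` function are hypothetical; the actual compiler is described in the Cognitive Core and Agentic Context Engineering docs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    source_id: str
    tokens: int             # estimated token cost of this item
    priority: int = 0       # higher wins; an assumed convention
    required: bool = False  # required evidence must never be dropped


def compile_context(candidates: list[Candidate], budget_tokens: int) -> dict:
    """Greedily fill the budget: required items first, then by priority."""
    ordered = sorted(candidates, key=lambda c: (not c.required, -c.priority))
    included, omitted, used = [], [], 0
    for item in ordered:
        if used + item.tokens <= budget_tokens:
            included.append(item.source_id)
            used += item.tokens
        elif item.required:
            raise ValueError(f"required item {item.source_id!r} exceeds the budget")
        else:
            omitted.append(item.source_id)
    return {
        "included": included,
        "omitted": omitted,
        "budget_report": {"limit": budget_tokens, "used": used},
    }


result = compile_context(
    [
        Candidate("policy-manifest", tokens=50, required=True),
        Candidate("graph-evidence", tokens=40, priority=5),
        Candidate("old-memory", tokens=30, priority=3),
    ],
    budget_tokens=100,
)
print(result["omitted"])  # ['old-memory']
```

Recording the omitted items alongside the included ones is the point of the sketch: the output shows both what the model saw and what the compiler excluded.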

Decision plane

The bounded loop that plans, critiques, executes, consolidates, and emits typed decision state.

Planner -> Critic -> Executor

Action plane

The governed integration boundary for tools, MCP, A2A, OpenAPI, internal functions, and custom adapters.

ToolGateway -> ToolEnvelope

Trust plane

The control plane over the other four: policy, approvals, evaluation, tracing, replay, and improvement.

Policy -> Eval -> Replay

The planes compose in one direction: Intelligence feeds Context, Context feeds Decision, Decision drives Action, and Trust wraps every boundary.

How improvement crosses the planes

The latest ContextOS improvement-loop rule is: the harness may be searched, but it must not silently mutate. Autotune, reviewer agents, and human operators can propose changes to any plane, but each proposal must name its target metric, replay set, guardrails, owner, and rollback target.

Plane	What can improve	What must stay invariant
Intelligence	ontology additions, source-priority hints, graph retrieval constraints, memory promotion proposals	CEID stability, source provenance, data classification, snapshot pinning
Context	bucket budgets, retrieval `top_k`, source priority, compression, prompt fragments	required evidence coverage, redaction, policy manifest, tool manifest
Decision	planner templates, tool ordering, re-plan budgets, Critic scoring rubrics, subagent lane limits	DecisionSpec binding, approval gates, loop guards, replayable plan and verdict
Action	adapter retries, circuit breakers, cached read-only aliases, version routing for compatible adapters	schema validation, approval-mode maximum, credential exchange, idempotency keys
Trust	evaluator thresholds, sampling strategy, replay-set composition, rollout gates, proposal ranking	safety and policy floors, human approval for promotion, append-only audit

Every improvement candidate is an artifact, not an edit in place. It enters the same lifecycle as packs, policies, tools, and evaluator suites: proposed -> reviewed -> approved -> released, with rejected and superseded recorded when the proposal does not survive review.
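
One way to picture the proposal lifecycle is as a small state machine that refuses incomplete proposals and illegal jumps. The sketch below is an assumption-heavy illustration: the state names and required fields come from the text above, but the `Proposal` class, the exact transition edges (for example, where `rejected` and `superseded` may occur), and the sample values are hypothetical.

```python
from dataclasses import dataclass, field

REQUIRED_FIELDS = ("target_metric", "replay_set", "guardrails", "owner", "rollback_target")

# Assumed transition table; the source names the states but not every edge.
TRANSITIONS = {
    "proposed": {"reviewed", "rejected"},
    "reviewed": {"approved", "rejected"},
    "approved": {"released", "superseded"},
    "released": {"superseded"},
    "rejected": set(),
    "superseded": set(),
}


@dataclass
class Proposal:
    plane: str
    declaration: dict
    state: str = "proposed"
    history: list = field(default_factory=list)  # appended to, never rewritten

    def __post_init__(self) -> None:
        missing = [name for name in REQUIRED_FIELDS if not self.declaration.get(name)]
        if missing:
            raise ValueError(f"proposal is missing: {', '.join(missing)}")

    def advance(self, new_state: str) -> None:
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.history.append((self.state, new_state))
        self.state = new_state


proposal = Proposal(
    plane="context",
    declaration={
        "target_metric": "evidence_coverage",
        "replay_set": "replay-2026-q2",
        "guardrails": ["redaction"],
        "owner": "context-team",
        "rollback_target": "pack-v41",
    },
)
for state in ("reviewed", "approved", "released"):
    proposal.advance(state)
print(proposal.state)  # released
```

Because a proposal cannot be constructed without its five declared fields and cannot skip from `proposed` to `released`, a silent in-place edit has no valid path through this model.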

One run through the foundations

Step	What happens	Contract produced	Primary docs
1. Capture	The request is wrapped with tenant, actor, agent, session, budget, safety mode, and trace identity.	`RunContext`	Governance, Identity Layer
2. Ground	The runtime resolves entities, retrieves evidence, and selects eligible memory.	CEIDs, graph evidence, memory candidates	Ontology, Knowledge Graph, Memory Model
3. Compile	The context compiler selects, ranks, redacts, budgets, and assembles the model input.	`CompiledContext`	Cognitive Core, Agentic Context Engineering
4. Decide	Planner, critic, and executor move inside bounded plan and verdict contracts.	plan, critic verdicts, decision binding	Orchestration, Decision Catalog
5. Act	Every external effect goes through the Tool Gateway with policy, approval, identity, and idempotency.	`ToolEnvelope`	Adapter Mesh, Governance
6. Record	The final answer, effects, evidence, approvals, controls, trace, and replay handle are persisted.	`DecisionRecord`	Evaluation and Observability, API Contracts
7. Improve	Failures and corrections become proposals that must pass replay and approval gates before promotion.	scorecard, strategy proposal, pack version	Improvement Loop, Harness Engineering
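
Step 5 depends on idempotency to make retries safe. The following is a minimal sketch of that idea, assuming an in-memory store keyed by idempotency key. The `ToolGateway` class shown here, its `call` signature, and the `replayed` flag are hypothetical and are not the Adapter Mesh API.

```python
from typing import Any, Callable


class ToolGateway:
    """Hypothetical in-memory gateway that deduplicates effects by idempotency key."""

    def __init__(self) -> None:
        self._envelopes: dict[str, dict[str, Any]] = {}

    def call(self, tool: Callable[..., Any], args: dict[str, Any],
             idempotency_key: str, trace_id: str) -> dict[str, Any]:
        if idempotency_key in self._envelopes:
            # Same key seen before: return the stored result, run nothing again.
            return {**self._envelopes[idempotency_key], "replayed": True}
        envelope = {
            "idempotency_key": idempotency_key,
            "trace_id": trace_id,
            "request": args,
            "result": tool(**args),
            "replayed": False,
        }
        self._envelopes[idempotency_key] = envelope
        return envelope


sent = []


def send_email(to: str) -> str:
    sent.append(to)  # stands in for the external effect
    return f"queued:{to}"


gateway = ToolGateway()
first = gateway.call(send_email, {"to": "ops@example.com"}, "key-1", "trace-1")
retry = gateway.call(send_email, {"to": "ops@example.com"}, "key-1", "trace-1")
print(len(sent), retry["replayed"])  # 1 True
```

A production gateway would also need durable storage and the policy, approval, and identity checks listed in the table; this sketch isolates the idempotency behavior only.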

Cross-cutting primitives

These primitives are deliberately boring: they appear everywhere so every subsystem can be audited the same way.

Primitive	What it carries	Why it matters
`RunContext`	`run_id`, `trace_id`, `session_id`, `tenant_id`, user delegation, agent workload identity, safety mode, run budget	Establishes who is acting, under which authority, with which limits.
`ApprovalMode`	`read_only`, `local_write`, `network`, `delegated`, `destructive`	Makes risk explicit before a tool can be planned or executed.
`ActionRisk`	effect, authority, reversibility, interaction, data scope	Represents risk dimensions that the compatibility approval-mode ladder cannot safely order.
`ContextPack`	Versioned, signed input contract with evidence, policy, tools, memory, and decision layers	Stops prompt stuffing from becoming an undocumented runtime dependency.
`CompiledContext`	Compiled prompt, manifests, omitted context, runtime controls, and budget report	Shows exactly what the model saw and what the compiler excluded.
`ToolEnvelope`	Tool request, tool result, policy decision, approval mode, audit metadata, idempotency, trace context	Turns side effects into governed operations.
`DecisionRecord`	Outcome, evidence refs, approvals, controls active, policy decisions, confidence, trace id, replay handle	Makes a run comparable, searchable, reviewable, and replayable.
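
To make two of these primitives concrete, here is a hedged sketch of `ApprovalMode` as an ordered enum and a trimmed `RunContext`. The mode names and identifier fields come from the table; treating the modes as a strict ladder in listed order, the `max_approval_mode` field, and the `may_plan` helper are assumptions for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum


class ApprovalMode(IntEnum):
    # Ordered as listed in the table; the strict ordering is an assumption.
    READ_ONLY = 0
    LOCAL_WRITE = 1
    NETWORK = 2
    DELEGATED = 3
    DESTRUCTIVE = 4


@dataclass(frozen=True)
class RunContext:
    run_id: str
    trace_id: str
    session_id: str
    tenant_id: str
    safety_mode: str
    max_approval_mode: ApprovalMode  # hypothetical field for the run's ceiling


def may_plan(tool_mode: ApprovalMode, ctx: RunContext) -> bool:
    """A tool is plannable only if its mode does not exceed the run's ceiling."""
    return tool_mode <= ctx.max_approval_mode


ctx = RunContext("run-1", "trace-1", "sess-1", "tenant-a", "standard", ApprovalMode.NETWORK)
print(may_plan(ApprovalMode.LOCAL_WRITE, ctx))   # True
print(may_plan(ApprovalMode.DESTRUCTIVE, ctx))   # False
```

A single ordered ladder is convenient but lossy, which is why the table lists `ActionRisk` separately: dimensions such as reversibility and data scope do not always sort along one axis.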

The end-to-end contract

invokeAgent(request_envelope, RunContext)
  -> Context plane: compile packs, evidence, tools, policy, memory
  -> Decision plane: plan, verify, execute, score, consolidate
  -> Action plane: route every effect through the Tool Gateway
  -> Trust plane: enforce policy, approvals, evaluation, trace, replay
  -> DecisionRecord(evidence_refs, approvals, controls_active, trace_id, replay_id)

The Intelligence plane feeds the compile step. Consolidation writes memory proposals back through governed promotion, not direct durable writes.
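
The contract above can be sketched as a short pipeline in which a policy check runs before every stage. This is an illustrative skeleton under stated assumptions, not the API Contracts definition: the `stages` and `check_policy` parameters, the stub stage payloads, and the `replay-<run_id>` identifier scheme are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class DecisionRecord:
    outcome: str
    evidence_refs: tuple
    approvals: tuple
    controls_active: tuple
    trace_id: str
    replay_id: str


def invoke_agent(request: dict, run_context: dict,
                 stages: dict[str, Callable[[Any], Any]],
                 check_policy: Callable[[str, Any], None]) -> DecisionRecord:
    controls = []

    def guarded(name: str, payload: Any) -> Any:
        check_policy(name, payload)  # Trust plane: raises to block the stage
        controls.append(f"policy:{name}")
        return stages[name](payload)

    compiled = guarded("context", request)
    plan = guarded("decision", compiled)
    effects = guarded("action", plan)
    return DecisionRecord(
        outcome=effects["outcome"],
        evidence_refs=tuple(compiled["evidence_refs"]),
        approvals=tuple(effects["approvals"]),
        controls_active=tuple(controls),
        trace_id=run_context["trace_id"],
        replay_id=f"replay-{run_context['run_id']}",  # hypothetical id scheme
    )


record = invoke_agent(
    {"question": "refund status?"},
    {"run_id": "run-1", "trace_id": "trace-1"},
    stages={
        "context": lambda req: {"prompt": req["question"], "evidence_refs": ["ev-1"]},
        "decision": lambda ctx: {"steps": ["lookup_refund"], "context": ctx},
        "action": lambda plan: {"outcome": "answered", "approvals": ["auto:read_only"]},
    },
    check_policy=lambda stage, payload: None,  # permissive stub
)
print(record.controls_active)  # ('policy:context', 'policy:decision', 'policy:action')
```

Wrapping each stage in the same `guarded` call mirrors the idea that Trust sits around every boundary rather than acting as a final step.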

What to read first

If you are evaluating the platform

Invest Early - the business case and late-retrofit failure modes.
Harness Engineering - the discipline behind the product.
Reference Architecture - the full five-plane blueprint.

If you are building the runtime

Cognitive Core - compiler plus bounded execution loop.
Adapter Mesh - governed tool execution.
Governance - policy, approval modes, audit contract.

If you are grounding an agent

Ontology - canonical types and relationship rules.
Identity Layer - CEIDs, SIDs, actor identity.
Knowledge Graph and Memory Model - evidence plus promotion-aware recall.

If you own safety, audit, or release gates

Evaluation and Observability - scorecards, traces, replay.
Improvement Loop - governed change from failures and corrections.
Security and Compliance - sandboxing, identity propagation, compliance map.

Foundation docs by plane

Plane	Foundation docs	Use them when you need to decide
Intelligence	Ontology, Identity Layer, Knowledge Graph, Memory Model	What the system knows, how entities resolve, what evidence is trusted, and what can be remembered.
Context	Cognitive Core, Agentic Context Engineering	What should enter the model input, under which budget, provenance, and runtime controls.
Decision	Orchestration, Cognitive Core	How plans are proposed, verified, executed, scored, retried, or escalated.
Action	Adapter Mesh	Which external capabilities are discoverable, callable, idempotent, and approval-bound.
Trust	Governance, Evaluation and Observability, Improvement Loop, Harness Engineering, Security and Compliance	Which policies apply, how approvals work, what gets evaluated, how replay works, and how the harness improves.

Adoption checklist

Use this before calling a workflow production-ready.

Check	Minimum acceptable answer	Source
Entity model	The workflow’s core entities have ontology types, stable CEIDs, and relationship rules.	Ontology, Identity Layer
Context boundary	The workflow has a versioned `ContextPack`; the compiler emits manifests, omissions, controls, and budgets.	Agentic Context Engineering, Context Pack
Tool boundary	Every external capability is behind the Tool Gateway with schemas, approval mode, idempotency, and trace propagation.	Adapter Mesh
Decision boundary	Outputs bind to a decision spec and produce a typed `DecisionRecord`.	Orchestration, Decision Record, Decision Catalog
Policy boundary	Policy lives outside agent code; approval modes map to the canonical tier taxonomy.	Governance
Evidence boundary	The record contains evidence refs for material claims and tool results.	Knowledge Graph, Evaluation and Observability
Replay boundary	The run can be replayed against pinned context, policy, tools, and evaluator versions.	Evaluation and Observability, Improvement Loop
Improvement boundary	Failures and corrections become proposals, not silent prompt edits.	Harness Engineering, Improvement Loop
Autotune boundary	Any optimizer run declares target metric, guardrails, tunable surfaces, disjoint search/test sets, and rollback target before producing a proposal.	Improvement Loop, Evaluation and Observability
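
The Autotune boundary check lends itself to an automated gate. The sketch below validates a declaration before an optimizer run, assuming the declaration is a plain dictionary and that replay sets are lists of run ids. The key names, the `autotune_problems` function, and the sample values are hypothetical.

```python
REQUIRED_KEYS = ("target_metric", "guardrails", "tunable_surfaces",
                 "search_set", "test_set", "rollback_target")


def autotune_problems(declaration: dict) -> list[str]:
    """List every reason the declaration is not ready for an optimizer run."""
    problems = [f"missing {key}" for key in REQUIRED_KEYS if not declaration.get(key)]
    # The search and test sets must be disjoint so results are not self-confirming.
    overlap = set(declaration.get("search_set", ())) & set(declaration.get("test_set", ()))
    if overlap:
        problems.append(f"search/test overlap: {sorted(overlap)}")
    return problems


declaration = {
    "target_metric": "task_success_rate",
    "guardrails": ["policy_floor", "redaction"],
    "tunable_surfaces": ["retrieval_top_k"],
    "search_set": ["run-1", "run-2", "run-3"],
    "test_set": ["run-3", "run-4"],
    "rollback_target": "pack-v41",
}
print(autotune_problems(declaration))  # ["search/test overlap: ['run-3']"]
```

An empty list means the declaration is complete; anything else should stop the optimizer before it produces a proposal.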

Where the concrete contracts live

The foundations define the operating model. The implementation section defines the concrete contracts:

API Contracts - `invokeAgent`, `ToolCallEnvelope`, `ToolResultEnvelope`, `DecisionRecord`
Context Pack - pack schema, layers, lifecycle, caching
Decision Catalog - `DecisionSpec`, `DecisionRecord`, decision binding
Intent-Task Catalog - intent taxonomy, task templates, risk classification
Memory Fabric - concrete memory storage, promotion, consent, contradiction handling
Workflow Examples - neutral end-to-end runs through the canonical contract
High-Risk Workflow - multi-approver, irreversible, cross-tenant workflow