Architecture & foundations
April 29, 2026
10 min read

The Five Planes of Agentic Operating Systems

A production agent does not fail only when the model gives a bad answer. It fails when the runtime cannot answer five ordinary questions after the run:

  • What did the system know?
  • What did the model actually see?
  • Why did it choose that path?
  • What external effects happened?
  • Who approved it, and can we replay it?

Those questions are the reason ContextOS uses five planes: Intelligence, Context, Decision, Action, and Trust. The planes are not a taxonomy for slides. They are operating boundaries. Each plane owns artifacts that the next plane can inspect, and Trust enforces the controls that make the other four usable in production.

The short version: if a production incident cannot be assigned to one plane and one contract, the architecture is still too implicit.

Why the planes matter now

The industry is standardizing the edges around agents. MCP gives applications a common way to connect models to data sources, tools, and workflows. A2A gives independent agents a common protocol for discovery, task exchange, and artifacts. Those protocols are useful because they make capabilities portable.

Portability also increases the cost of weak boundaries. A tool that is easy to expose is easy to overexpose. Retrieved content that can enter a prompt can also carry instructions. A peer agent that can collaborate can also become an unreviewed decision source. Guidance from OWASP’s LLM Top 10 and NIST AI RMF points in the same direction: trustworthy systems need lifecycle controls, risk ownership, evidence, monitoring, and governance outside model text.

That is the practical job of the five planes. They decide which part of the system owns knowledge, which part compiles it, which part reasons, which part touches the world, and which part enforces authority.

The five planes

Intelligence is the slow-moving substrate of meaning. It owns ontology, entity identity, knowledge graph snapshots, source contracts, embeddings, and promoted memory. New durable facts land here, under provenance and change control. Raw retrieval hits do not automatically become truth.

Context is the per-request compiler. It takes the pinned ContextPack, the request, RunContext, eligible evidence, policy, memory, tools, and budgets, then emits a CompiledContext. This is where source priority, redaction, tool surfacing, truncation, and runtime controls become explicit.

Decision is the bounded execution loop. Planner, Executor, and Critic turn CompiledContext into a plan, verify the plan, run approved steps, score the result, and emit a typed DecisionRecord. The Decision plane may propose action. It does not own policy and it does not directly hold credentials.

Action is the only path to external effects. The Tool Gateway mediates adapters such as MCP, A2A peers, OpenAPI, databases, and internal functions. Every call is schema-validated, identity-bound, approval-mode-bound, idempotent where needed, and traceable through a ToolEnvelope.

Trust is the control plane over the other four. It owns policy decisions, approval-mode tiers, evaluator gates, audit, replay, rollout controls, and improvement promotion. Trust is not a sentence in the system prompt. It is the regime applied at every boundary.

| Plane | Owns | Does not own |
| --- | --- | --- |
| Intelligence | facts, schema, identity, memory promotion, evidence snapshots | prompt assembly or policy interpretation |
| Context | pack compilation, manifests, redaction, tool surfacing, budgets | final verdicts or durable memory writes |
| Decision | plan, critic verdict, loop guard, decision typing | credentials, side effects, or policy source of truth |
| Action | gateway call, schema validation, idempotency, trace propagation | which business decision should be made |
| Trust | policy, approvals, evaluator gates, replay, promotion controls | hidden model reasoning |

The “does not own” column is usually where production failures hide. If the Decision plane is making policy choices, the boundary has eroded. If an adapter decides whether a refund is allowed, the Action plane has absorbed Decision and Trust work. If retrieval silently chooses between two conflicting policies, Intelligence has swallowed Context.

The incident shape this prevents

Imagine a support agent tells a customer the refund window is 90 days. The real policy is 14 days. The model did not invent the number. It retrieved a stale wiki page that should not have been eligible for the refund intent.

Without the planes, the postmortem says “the AI gave a bad answer.” With the planes, the failure has a narrower shape:

  • Intelligence: a stale source existed and lacked retirement or contradiction state.
  • Context: the pack allowed the stale wiki to compete with the policy bundle.
  • Decision: the Critic accepted a claim without the required policy evidence.
  • Action: nothing failed here yet, but no side effect (such as issuing the refund) should proceed unless the Tool Gateway sees the right approval mode and evidence refs.
  • Trust: the scorecard and replay gate should have caught the pack regression before rollout.

The fix is no longer “make the prompt stricter.” It is a source contract change, a pack priority rule, a required-evidence check, an evaluator update, or a policy gate.
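
The required-evidence check is the easiest of those fixes to picture in code. What follows is a minimal sketch, not ContextOS code: `Evidence`, `Claim`, `REQUIRED_EVIDENCE`, and `critic_verify` are hypothetical names, and the intent and source-kind strings are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Evidence:
    ref: str               # hypothetical identifier, e.g. "policy/refunds"
    kind: str              # "policy_bundle", "wiki", and so on
    retired: bool = False


@dataclass(frozen=True)
class Claim:
    intent: str
    text: str
    evidence: Tuple[Evidence, ...]


# Assumed rule table: the evidence kind each intent must cite.
REQUIRED_EVIDENCE = {"refund": "policy_bundle"}


def critic_verify(claim):
    """Accept only if the claim cites live evidence of the required kind."""
    required = REQUIRED_EVIDENCE.get(claim.intent)
    if required is None:
        return "accept"    # no rule registered for this intent
    live = [e for e in claim.evidence if not e.retired]
    if any(e.kind == required for e in live):
        return "accept"
    return "reject"


stale_wiki = Evidence(ref="wiki/refunds-old", kind="wiki")
policy = Evidence(ref="policy/refunds", kind="policy_bundle")

bad = Claim("refund", "The refund window is 90 days.", (stale_wiki,))
good = Claim("refund", "The refund window is 14 days.", (policy,))
```

Here `critic_verify(bad)` rejects the 90-day claim because it cites only the wiki page, no matter how confident the model text sounds. The check does not judge whether the number is right. It only asks whether the claim rests on the kind of evidence the intent requires.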

Why Context deserves its own plane

Most agent architectures fold Context into one of its neighbors. It becomes “retrieval” inside the data layer, or “prompt construction” inside the agent layer. Both shortcuts work for demos. Both break when the same intent runs across tenants, policies, source priorities, memory eligibility rules, and tool permissions.

The Context plane deserves its own name because it answers the question: what is this run allowed to see?

That answer needs to be versioned, testable, replayable, and enforceable.

  • Versioned: ctxpack.support@5.2.0 is a signed artifact, not an undocumented prompt string.
  • Testable: golden sets and replay runs can compare a candidate pack against the current pack before release.
  • Replayable: the compiler is deterministic given the same request, pack, snapshots, and RunContext.
  • Enforceable: the compiled manifest tells the Decision plane which policies, evidence, tools, memory, omissions, and controls were active.

This is why ContextOS treats prompt text as an output, not the contract. The contract is the CompiledContext: compiled prompt, context blocks, manifests, runtime controls, budget report, truncation diagnostics, and lineage. If the model did not see a tool, it cannot call the tool. If policy evidence was omitted, the omission is visible. If memory was excluded because of classification or freshness, that exclusion is inspectable.
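
A toy compiler makes the "output, not contract" point concrete. This sketch assumes a much smaller shape than the real CompiledContext: `Source`, `compile_context`, the `digest` field, and the character budget are all hypothetical, and "lower priority number wins" is an assumed convention.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    ref: str
    priority: int        # lower number wins (assumed convention)
    intents: frozenset   # intents this source is eligible for
    text: str


def compile_context(pack_version, intent, sources, budget_chars):
    """Select sources deterministically and record every omission with a reason."""
    included, omitted, used = [], [], 0
    # Sort on (priority, ref) so the result never depends on input order.
    for src in sorted(sources, key=lambda s: (s.priority, s.ref)):
        if intent not in src.intents:
            omitted.append({"ref": src.ref, "reason": "not_eligible_for_intent"})
        elif used + len(src.text) > budget_chars:
            omitted.append({"ref": src.ref, "reason": "budget_exceeded"})
        else:
            included.append(src)
            used += len(src.text)
    manifest = {
        "pack": pack_version,
        "intent": intent,
        "included": [s.ref for s in included],
        "omitted": omitted,
        "budget": {"limit": budget_chars, "used": used},
    }
    prompt = "\n\n".join(s.text for s in included)
    # A content hash gives replay a cheap equality check.
    payload = json.dumps({"manifest": manifest, "prompt": prompt}, sort_keys=True)
    digest = hashlib.sha256(payload.encode()).hexdigest()
    return {"prompt": prompt, "manifest": manifest, "digest": digest}


sources = [
    Source("wiki/refunds-old", 5, frozenset({"general"}), "Refunds within 90 days."),
    Source("policy/refunds", 1, frozenset({"refund"}), "Refunds within 14 days."),
]
a = compile_context("ctxpack.support@5.2.0", "refund", sources, budget_chars=200)
b = compile_context("ctxpack.support@5.2.0", "refund", list(reversed(sources)), budget_chars=200)
```

Both calls produce the same digest even though the sources arrive in a different order, and the stale wiki page shows up in `omitted` with a reason instead of silently disappearing. That manifest, not the prompt string, is what a reviewer or a replay run would compare.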

Why Trust sits over the other planes

Trust is often drawn as a peer box next to Decision or Observability. That drawing survives a whiteboard. It does not survive an audit.

In production, Trust appears at every boundary:

  • before Context compiles, policy resolves against RunContext;
  • before Decision executes, the Critic checks required evidence, controls, and approval gates;
  • before Action calls a tool, the Tool Gateway re-evaluates schema, identity, approval mode, and policy;
  • before Intelligence accepts memory, promotion gates check provenance, consent, classification, and contradiction state;
  • before a change ships, replay and scorecards decide whether the candidate can be promoted.

The model is allowed to propose. Trust decides whether the proposal can become part of the governed run.
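
The memory boundary shows the propose-versus-decide split in miniature. The sketch below is illustrative only: `MemoryProposal`, `promotion_gate`, the reason strings, and the set of promotable classifications are assumptions, not ContextOS definitions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MemoryProposal:
    fact: str
    provenance: Optional[str]    # evidence ref that supports the fact
    consent: bool
    classification: str          # e.g. "public", "internal", "restricted"
    contradicts_existing: bool


ALLOWED_CLASSIFICATIONS = {"public", "internal"}   # assumed policy


def promotion_gate(p):
    """Return (promoted, reasons); an empty reasons list means every check passed."""
    reasons = []
    if not p.provenance:
        reasons.append("missing_provenance")
    if not p.consent:
        reasons.append("missing_consent")
    if p.classification not in ALLOWED_CLASSIFICATIONS:
        reasons.append("classification_not_promotable")
    if p.contradicts_existing:
        reasons.append("unresolved_contradiction")
    return (not reasons, reasons)


proposal = MemoryProposal("Refund window is 14 days", "policy/refunds", True, "internal", False)
unsourced = MemoryProposal("Refund window is 90 days", None, True, "internal", True)
```

The gate collects every failed check instead of stopping at the first one, so the rejection itself becomes an inspectable artifact. Nothing in it reads model text to decide whether the fact is trustworthy.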

The canonical run through the planes

The flow through the planes, written as a contract:

invokeAgent(request_envelope, run_context)
  -> compile(packs, request, run_context) -> CompiledContext
  -> loop {
       planner(CompiledContext)         -> Plan
       critic.verify(Plan)              -> ok | replan | reject
       executor(Plan, ToolGateway)      -> step_results, evidence
       critic.score(step_results)       -> accept | retry | replan | escalate
       consolidate(effects, evidence)   -> memory_proposals
     }
  -> DecisionRecord(evidence_refs, approvals, controls_active, trace_id)

Trust is not a separate line in that pseudo-code because it is present at each arrow.
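
For readers who prefer something executable, here is one way the loop could look in Python. It is a sketch under assumptions: the callables are injected stand-ins for Planner, Critic, and Executor, the record is a plain dict, `max_iterations` is a hypothetical loop guard, and the `consolidate` step is left out for brevity.

```python
def invoke_agent(compiled_context, planner, critic_verify, executor, critic_score,
                 max_iterations=3):
    """Bounded Planner/Executor/Critic loop; every exit path returns a record."""
    evidence_refs = []

    def record(outcome, iterations):
        return {
            "outcome": outcome,
            "iterations": iterations,
            "evidence_refs": list(evidence_refs),
            "trace_id": compiled_context["trace_id"],
        }

    for iteration in range(1, max_iterations + 1):
        plan = planner(compiled_context)
        verdict = critic_verify(plan)              # ok | replan | reject
        if verdict == "reject":
            return record("rejected", iteration)
        if verdict == "replan":
            continue
        step_results, evidence = executor(plan)    # a real executor would call the Tool Gateway
        for ref in evidence:
            if ref not in evidence_refs:
                evidence_refs.append(ref)
        score = critic_score(step_results)         # accept | retry | replan | escalate
        if score == "accept":
            return record("accepted", iteration)
        if score == "escalate":
            return record("escalated", iteration)
        # "retry" and "replan" both start another iteration in this sketch
    return record("loop_guard_tripped", max_iterations)


scores = iter(["retry", "accept"])
decision_record = invoke_agent(
    compiled_context={"prompt": "compiled prompt text", "trace_id": "trace-123"},
    planner=lambda ctx: ["lookup_policy"],
    critic_verify=lambda plan: "ok",
    executor=lambda plan: ([f"{step}: done" for step in plan], ["policy/refunds"]),
    critic_score=lambda results: next(scores),
)
```

The design choice worth noticing is that there is no exit without a record. A rejected plan, an escalation, and a tripped loop guard all produce the same typed outcome as a success, which is what makes the run explainable afterwards.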

The primitives that carry the architecture

The planes only work if the same primitives show up everywhere. These are the ones worth memorizing:

| Primitive | What it proves |
| --- | --- |
| RunContext | who is acting, for which tenant, under which delegation, budget, trace, and safety mode |
| ApprovalMode | the maximum side-effect class permitted: read_only, local_write, network, delegated, destructive |
| ContextPack | the signed, versioned source contract for evidence, policy, tools, memory, and decision layers |
| CompiledContext | exactly what the model received, plus manifests, controls, omissions, and budget diagnostics |
| ToolEnvelope | every external effect with schema, identity, approval, policy, idempotency, and trace context |
| DecisionRecord | the governed outcome: evidence refs, approvals, controls active, policy decisions, trace, replay handle |

If a primitive is named, it is a contract. If a contract is broken, the runtime owes an explanation in the DecisionRecord or its linked trace artifacts.
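
Two of these primitives, ApprovalMode and ToolEnvelope, are small enough to sketch together. The field names, the `ToolGateway` methods, and the choice to treat the five modes as a strict ladder are assumptions for illustration. So is the idea that `issue_refund` requires `network`.

```python
from dataclasses import dataclass
from enum import IntEnum


class ApprovalMode(IntEnum):
    # Order follows the list in the table; treating it as a strict ladder is an assumption.
    READ_ONLY = 0
    LOCAL_WRITE = 1
    NETWORK = 2
    DELEGATED = 3
    DESTRUCTIVE = 4


@dataclass(frozen=True)
class ToolEnvelope:
    tool: str
    args: dict
    actor: str                    # delegated identity the call is bound to
    approval_mode: ApprovalMode   # maximum side-effect class granted to this run
    idempotency_key: str
    trace_id: str


class ToolGateway:
    def __init__(self):
        self._tools = {}    # name -> (required mode, required arg names, handler)
        self._results = {}  # idempotency key -> result of the first successful call

    def register(self, name, required_mode, required_args, handler):
        self._tools[name] = (required_mode, frozenset(required_args), handler)

    def call(self, env):
        if env.tool not in self._tools:
            raise LookupError(f"unknown tool: {env.tool}")
        required_mode, required_args, handler = self._tools[env.tool]
        if set(env.args) != required_args:           # stand-in for real schema validation
            raise ValueError(f"{env.tool} expects arguments {sorted(required_args)}")
        if env.approval_mode < required_mode:
            raise PermissionError(f"{env.tool} needs {required_mode.name}")
        if env.idempotency_key in self._results:     # a retry must not repeat the effect
            return self._results[env.idempotency_key]
        result = handler(**env.args)
        self._results[env.idempotency_key] = result
        return result


refunds_issued = []
gateway = ToolGateway()
gateway.register(
    "issue_refund",
    ApprovalMode.NETWORK,
    ["order_id"],
    lambda order_id: refunds_issued.append(order_id) or f"refunded {order_id}",
)
envelope = ToolEnvelope("issue_refund", {"order_id": "A1"}, "agent:support",
                        ApprovalMode.NETWORK, "refund-A1", "trace-123")
first = gateway.call(envelope)
second = gateway.call(envelope)   # same idempotency key, so the handler is not run again
```

The handler never sees a call the gateway has not already checked, and the second call returns the stored result without issuing a second refund. An envelope carrying `READ_ONLY` for the same tool raises `PermissionError` before the handler is reached.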

What to build first

You do not need to implement the whole operating system at once. The useful first slice is one high-value workflow with one explicit boundary per plane.

For the first production-grade path, define:

  1. one RunContext shape,
  2. one ContextPack,
  3. one governed adapter behind the Tool Gateway,
  4. one DecisionSpec,
  5. one terminal DecisionRecord,
  6. one replay path that can reproduce the record without live side effects.

That is enough to expose most hidden coupling. The work that follows becomes incremental: more packs, more tools, more evaluators, more memory classes, more approval gates.
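
The replay path is the item most often postponed, so it helps to see how small a first version can be. In this sketch the live gateway records each tool result on a tape keyed by step, and the replay gateway serves results only from that tape. `RecordingGateway`, `ReplayGateway`, `governed_run`, and the step keys are hypothetical names.

```python
class RecordingGateway:
    """Live path: performs the call and records the result on a tape."""

    def __init__(self, live_call):
        self._live_call = live_call
        self.tape = {}

    def call(self, key, tool, args):
        result = self._live_call(tool, args)
        self.tape[key] = result
        return result


class ReplayGateway:
    """Replay path: serves recorded results and never touches the outside world."""

    def __init__(self, tape):
        self._tape = dict(tape)

    def call(self, key, tool, args):
        if key not in self._tape:
            raise LookupError(f"no recorded result for {key}; refusing a live call")
        return self._tape[key]


def governed_run(gateway):
    """Stand-in for a governed run that ends in a record."""
    status = gateway.call("step-1", "lookup_order", {"order_id": "A1"})
    decision = "refund_eligible" if status == "delivered" else "needs_review"
    return {"decision": decision, "evidence_refs": ["step-1"]}


live_calls = []


def fake_backend(tool, args):
    live_calls.append(tool)    # stands in for a real external effect
    return "delivered"


recorder = RecordingGateway(fake_backend)
live_record = governed_run(recorder)
replayed_record = governed_run(ReplayGateway(recorder.tape))
```

The replayed record matches the live one, and `live_calls` still holds a single entry afterwards. If a code or pack change makes the run ask for a step that was never recorded, the replay fails loudly instead of reaching out to a live system.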

Plane health checklist

Use this as the quick architecture review:

| Plane | Healthy when |
| --- | --- |
| Intelligence | facts have owners, classifications, source contracts, promotion state, contradiction state, and retirement policy |
| Context | packs compile deterministically with source priority, bounded buckets, omitted-context diagnostics, and active runtime controls |
| Decision | plans are verified, loop-guarded, scored, and emitted as typed DecisionRecords |
| Action | every effect goes through the Tool Gateway with schema validation, delegated identity, idempotency, approval mode, and trace context |
| Trust | policy, approvals, reviewer verdicts, scorecards, replay, rollout gates, and memory promotion are enforced outside model text |

The checklist is deliberately operational. It does not ask whether the model is impressive. It asks whether the system can be understood, replayed, and changed without guessing.

The takeaway

The five-plane model is not a claim that every runtime needs five teams or five services. It is a claim about responsibility. Knowledge, compiled input, decision logic, external effects, and governance are different jobs. When they are collapsed into prompts, postmortems turn into archaeology. When they are named as contracts, failures become repairable artifacts.

That is the real value of the architecture: not fewer incidents, but incidents that point to the exact boundary that must change.
