Skip to content
Press / to search

Context Pack Compiler

Context-plane component that turns a pinned Context Pack + RunContext into a CompiledContext.

Reference DesignLast reviewed: Edit on GitHub
At a glance
Context planePer-request compilation

The only Context-plane component — turns a pinned Pack + RunContext into a typed CompiledContext envelope.

Inputs
  • Pinned ContextPack
  • RunContext with RunBudget
  • InvokeRequest (intent, message, channel, locale, structured context)
  • Caller-supplied evidence_refs from the Knowledge Graph
  • Caller-supplied promoted memory recalls
Outputs
  • CompiledContext.compiled_prompt (system, developer, task, context_blocks)
  • CompiledContext.manifests (policy / tool / evidence)
  • CompiledContext.runtime_controls
  • CompiledContext.budget_report (allocations, used_at_compile, bucket_truncations, dropped_block_ids)
  • CompiledContext.context_ledger (pack, request, policy, tool, evidence, memory, budget, hash)
Lifecycle
  1. resolve
  2. retrieve
  3. allocate
  4. render
  5. manifest
Canonical types
  • ContextPack
  • CompiledContext
  • RunContext
  • BudgetReport
  • ContextLedger

Reference Architecture

The Context Pack Compiler is the only Context-plane component. It turns the pinned Context Pack plus the materialized Run Context into the CompiledContext envelope that drives the Decision plane.

Definition

A pure pipeline (no I/O, no clock, no global state) with eight ordered stages. Reads pack + Run Context + caller-supplied evidence + caller-supplied promoted memory; emits compiled_prompt, manifests, runtime_controls, budget_report, and context_ledger.

Why it exists

Without a single deterministic compiler, every team builds its own prompt assembly and drifts. The compiler centralizes the policy/tool/evidence/memory composition so the same pack on the same snapshot always produces the same CompiledContext — the property that makes replay possible.

The eight stages

  1. Intent classification — confirm intent from Run Context against the Intent-Task Catalog.
  2. Policy resolution — JsonLogic over the merged eval context (Run Context + request); fire applicable rules in priority order.
  3. Tool surfacing — compute Registry ∩ Permissions − Prohibitions and apply approval-mode filter against RunContext.safety_mode.
  4. Evidence retrieval — caller supplies retrieved hops from the Knowledge Graph; compiler does not call out.
  5. Memory recall — caller supplies promoted memory only (never raw capture).
  6. Token budget allocation — distribute RunBudget.bucket_tokens across context buckets (default proportional split if not declared).
  7. Bucket assembly — pack each bucket to its budget, sorted by priority; emit truncated flag explicitly per bucket — never silently.
  8. Manifests + runtime controls — emit policy_manifest, tool_manifest, evidence_manifest, runtime_controls (must_refuse, must_escalate, approval_gates_active, redaction_rules_active), and the audit ledger.

Inputs

  • Pinned ContextPack
  • RunContext with RunBudget
  • InvokeRequest (intent, message, channel, locale, structured context)
  • Caller-supplied evidence_refs from the Knowledge Graph
  • Caller-supplied promoted memory recalls

Outputs

  • CompiledContext envelope:
    • compiled_prompt (system, developer, task, context_blocks)
    • manifests (policy_manifest, tool_manifest, evidence_manifest)
    • runtime_controls
    • budget_report (allocations, used_at_compile, bucket_truncations, dropped_block_ids, warnings)
    • context_ledger (pack, request, policy, tool, evidence, memory, budget, hash)

Implementation learning

The reference compiler should expose the audit information that operators need when a compile looks wrong. A downstream implementation surfaced three fields that are worth keeping in the canonical contract:

  • context_ledger: a deterministic audit summary for the compile, including pack ref, request id, policy bundles, surfaced tool identifiers, evidence refs, memory refs, budget summary, and a content hash for replay comparison.
  • budget_report.dropped_block_ids: the exact blocks that did not fit, grouped by bucket. A boolean bucket_truncations flag is useful for dashboards, but it is not enough for debugging.
  • tool_manifest[].capability_metadata: per-capability metadata such as kind, risk_level, approval_mode, and derivation/source. This lets the Critic and UI explain why a tool was surfaced without re-reading the registry.

These are diagnostics, not execution knobs. They should be produced by the compiler from pinned inputs and included in replay fixtures; callers should not mutate them after compile.

How it works (sequence)

RunContext + InvokeRequest + ContextPack
   ↓ classify intent
   ↓ resolve policies (JsonLogic, priority-sorted bundles)
   ↓ surface tools (registry ∩ perms − prohibits, approval-mode filter)
   ↓ assemble blocks (business / policy / tool / evidence / memory / session)
   ↓ pack to bucket budgets (priority-sorted; explicit truncation)
   ↓ emit manifests + runtime_controls + budget_report + context_ledger
CompiledContext

Failure modes

  • Pack pinned by name without version — refuse at the boundary, before the compiler runs.
  • Enforced JsonLogic rule throws — fail closed and emit policy_eval_error. Only rules explicitly marked non_enforcing may be skipped with a diagnostic.
  • Bucket assembly silently truncates — bug; the compiler must always set bucket_truncations[bucket] = true and list the dropped block IDs.
  • Tool surfacing surfaces a capability whose approval_mode exceeds safety_mode — bug; safety filter rejected at write time.
  • Tool manifest omits capability metadata — not a runtime failure, but a diagnosability gap; the Critic can still enforce modes, but operators lose the local explanation.
  • Evidence/memory caller supplied stale data — caught downstream by Critic; not the compiler’s concern.

Operational concerns

  • Cold-start latency dominated by pack registry hydration; cache aggressively.
  • Token estimation discrepancy between compiler and runtime LLM tokenizer; reconcile at the edge.
  • Compile-time idempotency: identical inputs must return identical outputs (tested via replay).
  • Per-pack-version compile-time metrics for regression detection.

Evaluation metrics

  • Compile-success rate per pack version.
  • Compile-time p50 / p99.
  • Bucket-truncation rate per bucket per intent (target: low; alert on spikes).
  • Tool-surface size at compile time per intent (target: minimal sufficient set).
  • Replay determinism (identical CompiledContext on identical inputs).

Reference implementation

src/lib/contextos/compiler.ts is a typed reference implementation of the pipeline. It is not the production runtime but mirrors the contract.