
Governance

Policy outside agent code, approval-mode tiers, and the audit contract.

Foundational Spec
At a glance
Trust plane
read_only · local_write · network · delegated · destructive
Control over the other four

Policy outside agent code — deterministic enforcement at compile, plan, and execute, bound to approval-mode tiers.

Inputs
  • Versioned policy bundles (JsonLogic rules)
  • Guardrails: must_refuse / must_escalate / redaction_rules
  • Run Context + claims (user, agent, tenant, role)
  • Tool capability declarations bound to an approval mode
Outputs
  • Allow/deny decisions with structured obligations
  • Approval-gate verdicts with frozen evidence snapshot
  • Audit records bound to the DecisionRecord
  • Effective approval mode + approver identity per execution
Lifecycle
  1. intercept
  2. evaluate
  3. decide
  4. gate
  5. audit
Canonical types
  • PolicyBundle
  • ApprovalGate
  • ApprovalMode
  • AuditRecord

Governance is the Trust-plane capability that decides what an agent can do, when, on whose authority, and what evidence the action must produce. It is enforced at a deterministic boundary — never delegated to model self-policing.

Definition

A coordinated set of: policy bundles (versioned JsonLogic rules scoped by intent and risk), guardrails (must_refuse, must_escalate, redaction_rules), approval-mode tiers (the canonical risk taxonomy every tool and decision binds to), approval gates (human checkpoints for high-risk steps), and an audit contract that ties every applied rule and gate decision back to a Decision Record.

Why it exists

Enterprise systems require enforceable controls. Models drift, prompts leak, tools misbehave. Governance makes the runtime safe under change by moving the decision boundary outside the agent — no model can talk its way around a deterministic policy check, an approval-mode tier, or an evidence requirement.

How it works

  1. Policy bundles — versioned rules scoped by intent, risk, and channel.
  2. Guardrails — must-refuse and must-escalate rules that override all other logic.
  3. Approval-mode tiers — every tool capability and decision binding declares the highest mode it can produce.
  4. Approval gates — human authorization for high-risk steps with frozen evidence snapshot.
  5. Enforcement — policies evaluated at compile, plan, and execute stages.
  6. Audit — every applied rule, gate decision, and policy outcome is recorded against the run.

Approval-mode tiers

Approval gates are more consistent and easier to audit across workflows when they bind to a shared risk taxonomy rather than to ad-hoc gate names per workflow. The five canonical modes are:

Mode | Examples | Default policy
read_only | lookups, search, retrieval | allow with audit
local_write | tenant-scoped writes that can be reverted in-tenant (notes, drafts, memory) | allow with idempotency key + audit
network | outbound calls, webhooks, third-party reads | allow with egress policy + rate budget
delegated | acts on behalf of a user against an external system (booking, message send, calendar write) | require valid user delegation token + per-call evidence
destructive | irreversible side effects (payment capture, account deletion, data export) | require named approver + frozen evidence snapshot + post-execution audit

Wiring

  • Tool capability declares approval_mode: destructive on the adapter contract.
  • Policy can select a lower effective approval mode for a bounded request when the capability’s declared maximum allows it; it cannot exceed the declared maximum or invent a mode outside the taxonomy.
  • The Decision Catalog records the effective mode and the approver identity per execution, enabling cross-workflow audit by risk class.
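The ceiling rule in the wiring above can be sketched in a few lines. This is a minimal illustration, not the runtime's implementation: the `ApprovalMode` enum values, the numeric ordering (taken from the order of the table above), and the `resolve_effective_mode` helper are all assumptions made for the example, as is the choice to fall back to the declared maximum when policy selects nothing.

```python
from enum import IntEnum
from typing import Optional


class ApprovalMode(IntEnum):
    """The five canonical tiers; numeric order is an assumption (table order)."""
    READ_ONLY = 0
    LOCAL_WRITE = 1
    NETWORK = 2
    DELEGATED = 3
    DESTRUCTIVE = 4


def parse_mode(name: str) -> ApprovalMode:
    """Reject any mode name that is not part of the taxonomy."""
    try:
        return ApprovalMode[name.upper()]
    except KeyError:
        raise ValueError(f"unknown approval mode: {name!r}") from None


def resolve_effective_mode(declared_max: str, requested: Optional[str]) -> ApprovalMode:
    """Return the mode policy may apply to one bounded request.

    Policy may select a lower mode than the capability's declared maximum,
    but never a higher one and never a mode outside the taxonomy.
    """
    ceiling = parse_mode(declared_max)
    if requested is None:
        return ceiling  # assumption: with no policy selection, the maximum applies
    effective = parse_mode(requested)
    if effective > ceiling:
        raise ValueError(f"{requested} exceeds declared maximum {declared_max}")
    return effective
```

For a capability declared as `destructive`, `resolve_effective_mode("destructive", "delegated")` yields the delegated tier, while asking for `destructive` on a capability declared as `network` raises an error instead of silently widening the tier.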

Policy outside agent code

Policy decisions never rely on model self-policing.

Boundary enforcement model

  • Intercept every ToolCallEnvelope before adapter execution.
  • Evaluate deterministic allow/deny decisions against current Run Context and claims.
  • Refuse revoked agent registrations, expired or invalid identity claims, tenant mismatches, and child claims broader than the parent or manifest scope ceiling.
  • Enforce parameter-level constraints (required fields, value limits, regex/pattern checks).
  • Return structured denial obligations (escalate, request_approval, collect_evidence) to the Orchestrator.
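As a rough sketch of the boundary check described above, the function below intercepts an envelope, applies identity and parameter checks, and returns a structured verdict. The envelope and claim field names (`tenant`, `args`, `revoked`), the `Verdict` type, the constraint format, and the mapping from failure type to obligation are hypothetical; the real ToolCallEnvelope and obligation contract may differ.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Verdict:
    allow: bool
    reasons: list = field(default_factory=list)
    obligations: list = field(default_factory=list)


def check_envelope(envelope: dict, run_context: dict, constraints: dict) -> Verdict:
    """Deterministic allow/deny check, run before the adapter executes."""
    reasons = []
    claims = run_context["claims"]

    # Identity checks: a revoked registration or tenant mismatch is refused.
    if claims.get("revoked"):
        reasons.append("agent_registration_revoked")
    if claims.get("tenant") != envelope.get("tenant"):
        reasons.append("tenant_mismatch")

    # Parameter-level constraints: required fields, value limits, patterns.
    args = envelope.get("args", {})
    for name, rule in constraints.items():
        value = args.get(name)
        if value is None:
            if rule.get("required"):
                reasons.append(f"missing:{name}")
            continue
        if "max" in rule and value > rule["max"]:
            reasons.append(f"over_limit:{name}")
        if "pattern" in rule and not re.fullmatch(rule["pattern"], str(value)):
            reasons.append(f"bad_format:{name}")

    if not reasons:
        return Verdict(allow=True)

    # Assumption for this sketch: an over-limit value alone can be recovered
    # through approval; any other failure is escalated.
    only_limits = all(r.startswith("over_limit:") for r in reasons)
    obligation = "request_approval" if only_limits else "escalate"
    return Verdict(allow=False, reasons=reasons, obligations=[obligation])
```

The Orchestrator receives the `Verdict` rather than an exception, so it can act on the obligation (for example, open an approval gate) without the model being involved in the decision.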

Policy language and authoring

  • Primary authoring surface: ContextOS policy DSL (JsonLogic-based in the current runtime).
  • Optional accelerator: NL-to-policy compilation for draft rules.
  • Required safety checks for generated rules:
    • permissiveness drift (rule expands allowed surface unexpectedly),
    • restrictiveness drift (rule blocks critical safe paths),
    • unsatisfiable conditions (no runtime context can satisfy the rule).
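One inexpensive way to approximate these three checks is to replay a set of sample contexts through the previous and the generated rule and compare outcomes. The sketch below assumes rules are already compiled to Python predicates and that a representative sample set exists; `drift_report` and its result keys are hypothetical names. Sampling can only surface drift, it cannot prove a rule unsatisfiable, so a solver-based check would still be needed for that guarantee.

```python
from typing import Callable, Iterable

Rule = Callable[[dict], bool]


def drift_report(old: Rule, new: Rule, samples: Iterable[dict]) -> dict:
    """Compare two versions of a rule over sample contexts."""
    contexts = list(samples)
    # Contexts the new rule allows that the old rule did not.
    newly_allowed = [c for c in contexts if new(c) and not old(c)]
    # Contexts the old rule allowed that the new rule now blocks.
    newly_denied = [c for c in contexts if old(c) and not new(c)]
    return {
        "permissiveness_drift": newly_allowed,
        "restrictiveness_drift": newly_denied,
        # No sample satisfies the new rule: a hint, not a proof.
        "never_satisfied": not any(new(c) for c in contexts),
    }


# Usage: a generated rule raised the refund limit from 200 to 500.
old_rule = lambda c: c["refund_amount"] <= 200
new_rule = lambda c: c["refund_amount"] <= 500
report = drift_report(
    old_rule,
    new_rule,
    [{"refund_amount": 100}, {"refund_amount": 300}, {"refund_amount": 600}],
)
```

Here `report["permissiveness_drift"]` contains the 300 case, which is exactly the widened surface a reviewer should confirm before the draft rule is published.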

Policy lifecycle

  1. Author — create or update a policy bundle with versioned rules.
  2. Validate — lint for unreachable rules, conflicts, and missing evidence.
  3. Approve — security/governance review for high-risk domains.
  4. Publish — promote to an environment-specific bundle version.
  5. Enforce — apply at compile, plan, and execute stages.
  6. Audit — record applied rules, evidence, and gate outcomes.

Runtime checkpoints

  • Compile-time — policy selection + guardrail activation.
  • Plan-time — verify steps and required evidence/approvals.
  • Execution-time — just-in-time checks with latest context.

Audit expectations

  • Emit a policy_decision_id and matched rule_ids[] for each boundary verdict.
  • Persist input claims, normalized arguments, and decision rationale.
  • Persist agent_identity.subject, agent_identity.claim_hash, principal_chain, kid, and scope summaries for each governed action.
  • Tie every enforced policy outcome to the Decision Record for replay and compliance evidence.
  • Approval-gate decisions persist the approver identity, frozen evidence snapshot hash, and the effective approval mode.
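A possible shape for such a record is sketched below. The field names follow the bullets above where they are given (`policy_decision_id`, `rule_ids`, `claim_hash`); everything else, including the `AuditRecord` layout, the `build_audit_record` helper, the use of a random UUID, and SHA-256 over key-sorted JSON as the hashing scheme, is an assumption for illustration only.

```python
import hashlib
import json
import uuid
from dataclasses import asdict, dataclass
from typing import Optional


def canonical_hash(payload: dict) -> str:
    """SHA-256 over key-sorted JSON, so equal payloads always hash equally."""
    encoded = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


@dataclass(frozen=True)
class AuditRecord:
    policy_decision_id: str
    decision_record_id: str
    rule_ids: tuple
    agent_subject: str
    claim_hash: str
    normalized_args: dict
    rationale: str
    effective_approval_mode: str
    approver: Optional[str] = None
    evidence_snapshot_hash: Optional[str] = None


def build_audit_record(decision_record_id, rule_ids, claims, args, rationale,
                       mode, approver=None, evidence=None) -> AuditRecord:
    """Assemble one audit record for a boundary verdict."""
    return AuditRecord(
        policy_decision_id=str(uuid.uuid4()),
        decision_record_id=decision_record_id,
        rule_ids=tuple(rule_ids),
        agent_subject=claims["subject"],
        claim_hash=canonical_hash(claims),
        normalized_args=dict(sorted(args.items())),
        rationale=rationale,
        effective_approval_mode=mode,
        approver=approver,
        evidence_snapshot_hash=canonical_hash(evidence) if evidence is not None else None,
    )


# Usage: serialize the record for an append-only audit sink.
record = build_audit_record(
    "DR-001", ["R_REFUND_REQUIRES_IDV"],
    {"subject": "agent:support-bot", "tenant": "t1"},
    {"refund_amount": 120, "order_id": "ORD-42"},
    "Refunds require verified identity.", "delegated",
)
line = json.dumps(asdict(record), sort_keys=True)
```

Hashing the claims instead of storing only a reference makes it possible to detect later whether the identity material presented at decision time matches what replay reconstructs.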

Regulatory timeline checkpoints (EU)

For teams operating in or serving the EU market, governance controls should map to the EU AI Act's staged applicability dates:

  • 2025-02-02: AI literacy obligations and prohibited-practice rules become applicable.
  • 2025-08-02: General-purpose AI (GPAI) obligations apply.
  • 2026-08-02: transparency obligations (including Article 50) broadly apply.

Operationally: policy bundles must encode transparency, disclosure, and human-oversight controls as runtime-enforced requirements, not post-hoc documentation tasks.

Implementation mapping

Governance is implemented primarily by:

Implementation references

Policy bundle structure (example)

{
  "bundle_id": "POLICY_RETURNS_V4",
  "effective_from": "2026-01-01",
  "priority": 10,
  "policy_dsl": {
    "language": "jsonlogic",
    "rules": [
      {
        "rule_id": "R_REFUND_REQUIRES_IDV",
        "applies_to": { "intent": "support.refund" },
        "if": { "==": [{ "var": "request.context.identity_verified" }, true] },
        "then": { "allow": true, "requires": ["order_lookup"] },
        "rationale": "Refunds require verified identity.",
        "citations": ["policy/returns_v4#sec2.1"]
      },
      {
        "rule_id": "R_HIGH_VALUE_REQUIRES_APPROVAL",
        "applies_to": { "intent": "support.refund" },
        "if": {
          "and": [
            { "==": [{ "var": "user.role" }, "support_agent"] },
            { ">": [{ "var": "request.context.refund_amount" }, 3000] }
          ]
        },
        "then": {
          "allow": true,
          "approval_mode": "destructive",
          "requires_approval_gate": "GATE_FINANCE_APPROVAL",
          "arg_constraints": { "refund_amount": { "max": 3000, "unless_approved": true } }
        },
        "decision_binding": "decision.support.refund.execute",
        "rationale": "High-value refunds require finance approval."
      }
    ]
  },
  "prohibited_claims": ["refund_guaranteed"]
}
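To make the rule semantics concrete, here is a small evaluator for the JsonLogic subset the bundle above uses (`var`, `==`, `>`, `<=`, `and`). It is a simplified sketch, not the ContextOS runtime or a full JsonLogic implementation: equality uses Python semantics rather than JsonLogic's loose comparison, `and` returns a plain boolean, and ordering comparisons against a missing value fail closed. The context shape (top-level `user` and `request.context`) is inferred from the `var` paths in the example, and `matched_rules` is a hypothetical helper.

```python
def get_var(path, data):
    """Resolve a dotted path such as 'request.context.refund_amount'."""
    node = data
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None  # a missing value resolves to None
        node = node[key]
    return node


def evaluate(expr, data):
    """Evaluate a JsonLogic expression limited to var, ==, >, <=, and."""
    if not isinstance(expr, dict):
        return expr  # literal value
    (op, args), = expr.items()
    if op == "var":
        return get_var(args, data)
    values = [evaluate(arg, data) for arg in args]
    if op == "and":
        return all(values)
    if op not in ("==", ">", "<="):
        raise ValueError(f"unsupported operator: {op}")
    left, right = values
    if op == "==":
        return left == right
    if left is None or right is None:
        return False  # fail closed when a compared value is missing
    return left > right if op == ">" else left <= right


def matched_rules(bundle, intent, data):
    """Return (rule_id, then) for each applicable rule whose condition holds."""
    hits = []
    for rule in bundle["policy_dsl"]["rules"]:
        if rule["applies_to"].get("intent") != intent:
            continue
        if evaluate(rule["if"], data):
            hits.append((rule["rule_id"], rule["then"]))
    return hits


# Usage with the high-value rule from the bundle above (trimmed).
bundle = {"policy_dsl": {"language": "jsonlogic", "rules": [{
    "rule_id": "R_HIGH_VALUE_REQUIRES_APPROVAL",
    "applies_to": {"intent": "support.refund"},
    "if": {"and": [
        {"==": [{"var": "user.role"}, "support_agent"]},
        {">": [{"var": "request.context.refund_amount"}, 3000]},
    ]},
    "then": {"allow": True, "requires_approval_gate": "GATE_FINANCE_APPROVAL"},
}]}}
context = {"user": {"role": "support_agent"},
           "request": {"context": {"refund_amount": 5000}}}
hits = matched_rules(bundle, "support.refund", context)
```

With a refund amount of 5000 the rule matches and its `then` block carries the gate requirement; with 100, or with the amount missing, `matched_rules` returns an empty list.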

Approval workflow (runtime)

  1. Gate triggered — policy returns requires_approval_gate.
  2. Context frozen — tool args + evidence snapshot stored.
  3. Approver notified — role-based routing (e.g., fraud, finance).
  4. Decision recorded — approve / deny with rationale.
  5. Resume or halt — execution continues against the frozen evidence or is blocked.
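The five steps above might look like the following in code. This is a sketch under stated assumptions: `PendingApproval`, `open_gate`, `record_decision`, and `resume` are hypothetical names, approver routing and notification are omitted, and the snapshot is frozen by serializing it to canonical JSON and hashing it with SHA-256.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class PendingApproval:
    gate_id: str
    frozen_snapshot: str  # canonical JSON of tool args + evidence
    snapshot_hash: str
    decision: Optional[str] = None
    approver: Optional[str] = None
    rationale: Optional[str] = None


def open_gate(gate_id: str, tool_args: dict, evidence: dict) -> PendingApproval:
    """Steps 1-2: the gate is triggered and the context is frozen."""
    frozen = json.dumps({"args": tool_args, "evidence": evidence}, sort_keys=True)
    digest = hashlib.sha256(frozen.encode("utf-8")).hexdigest()
    return PendingApproval(gate_id, frozen, digest)


def record_decision(pending: PendingApproval, approver: str,
                    decision: str, rationale: str) -> None:
    """Step 4: record approve / deny once, with the approver and rationale."""
    if decision not in ("approve", "deny"):
        raise ValueError(f"invalid decision: {decision!r}")
    if pending.decision is not None:
        raise RuntimeError("gate already decided")
    pending.decision = decision
    pending.approver = approver
    pending.rationale = rationale


def resume(pending: PendingApproval) -> Optional[dict]:
    """Step 5: return the frozen args to execute, or None if blocked."""
    if pending.decision != "approve":
        return None
    digest = hashlib.sha256(pending.frozen_snapshot.encode("utf-8")).hexdigest()
    if digest != pending.snapshot_hash:
        raise RuntimeError("evidence snapshot was modified after freezing")
    return json.loads(pending.frozen_snapshot)["args"]
```

Because `resume` returns the arguments stored at freeze time, a caller that mutates its own copy of the tool arguments after the gate opens cannot change what the approver actually authorized.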

Conflict resolution

  • Priority wins — higher-priority bundles override lower ones.
  • Guardrails first — must-refuse / must-escalate are absolute.
  • Explicit deny beats allow — if two rules conflict, deny unless explicitly overridden by a higher-priority rule.
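One reading of these three rules as a resolution function is sketched below. Several details are assumptions because the text does not fix them: a larger `priority` number is treated as higher priority, `must_refuse` is checked before `must_escalate`, a request with no matching rule is denied, and the input shapes (`guardrail_hits`, `verdicts`) are hypothetical.

```python
def resolve(guardrail_hits, verdicts):
    """Combine guardrail hits and rule verdicts into one outcome.

    guardrail_hits: iterable of "must_refuse" / "must_escalate" strings.
    verdicts: list of {"rule_id": str, "priority": int, "allow": bool}.
    """
    hits = set(guardrail_hits)
    # Guardrails first: they are absolute and ignore rule priority.
    if "must_refuse" in hits:
        return "refuse"
    if "must_escalate" in hits:
        return "escalate"

    if not verdicts:
        return "deny"  # assumption: default deny when nothing matches

    # Priority wins: only the highest-priority verdicts take part.
    top = max(v["priority"] for v in verdicts)
    winners = [v for v in verdicts if v["priority"] == top]

    # Explicit deny beats allow among rules of equal priority.
    if any(not v["allow"] for v in winners):
        return "deny"
    return "allow"
```

Under this reading, a priority-10 allow overrides a priority-5 deny, two priority-10 rules that disagree resolve to deny, and any `must_refuse` hit ends the evaluation regardless of what the rules say.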

Interfaces

Inputs

  • Policy DSL bundles and invariants
  • Approval workflows and role definitions
  • Risk classifications per intent or task
  • Run Context (user, agent, tenant, claims)

Outputs

  • Allow/deny decisions with reasons
  • Required evidence references
  • Effective approval mode
  • Audit trails of policy application

Failure modes

  • Policy drift across environments.
  • Missing approval gates for high-risk actions.
  • Evidence requirements not enforced at the right checkpoint.
  • Overly strict policies causing deadlocks.
  • Auto-generated rules with unsatisfiable conditions slipping past validation.

Operational concerns

  • Policy version pinning per environment.
  • Separation of duties for policy changes.
  • Policy evaluation latency budgets.
  • Approval queue SLAs by risk tier.
  • Policy rollback and deprecation windows.
  • Regulatory control mapping to NIST AI RMF and ISO/IEC 42001 control families.

Evaluation metrics

  • Policy compliance rate.
  • Approval latency and rate by tier.
  • Evidence attachment success rate.
  • Audit gap rate (target: zero).
  • Permission-violation rate.

Example

A VIP-instant-refund rule that selects a lower effective approval mode within the capability’s declared maximum:

{
  "rule_id": "R_VIP_INSTANT_REFUND",
  "applies_to": { "intent": "support.refund" },
  "if": {
    "and": [
      { "==": [{ "var": "intent" }, "support.refund"] },
      { "==": [{ "var": "request.context.user.is_vip" }, true] },
      { "<=": [{ "var": "request.context.refund_amount" }, 200] }
    ]
  },
  "then": {
    "allow": true,
    "approval_mode": "delegated",
    "requires_approval_gate": null
  },
  "decision_binding": "decision.support.refund.execute",
  "rationale": "VIP members get instant refunds up to limit; effective mode is delegated for this bounded request."
}

An identity- and supplier-bound execution example:

{
  "rule_id": "R_BOOKING_CANCEL_OWNERSHIP_AND_WINDOW",
  "applies_to": { "intent": "booking.cancel" },
  "if": {
    "and": [
      { "==": [{ "var": "request.context.booking.user_id" }, { "var": "user.user_id" }] },
      { "==": [{ "var": "request.context.supplier.cancel_window_open" }, true] }
    ]
  },
  "then": {
    "allow": true,
    "approval_mode": "delegated",
    "requires": ["supplier_policy_ref", "booking_ownership_proof"]
  },
  "else": { "allow": false, "reason": "supplier_window_or_identity_failed" },
  "decision_binding": "decision.booking.cancel.eligibility"
}

Common misconceptions

  • Governance is not just logging. It is active enforcement at runtime with evidence-bound audit.
  • Approval gates are not a bottleneck when scoped by risk. Most calls are read_only or local_write and never see a gate.
  • Policy is not the model’s responsibility. The model proposes; the boundary decides.
  • Approval-mode tiers are not interchangeable with gate names. Tiers are the canonical risk taxonomy; gate names are runtime artifacts that bind to a tier.