Skip to content
Press / to search

Policy Engine

Trust-plane component that evaluates JsonLogic rules and emits structured policy decisions.

Reference DesignLast reviewed: Edit on GitHub
At a glance
Trust planeread_onlylocal_writenetworkdelegateddestructiveControl over the other four

Evaluates priority-sorted JsonLogic policy bundles + guardrails; emits PolicyDecisions with effective approval mode and obligations.

Inputs
  • PolicyBundle[] from the active Context Pack (priority-sorted)
  • Guardrails (must_refuse, must_escalate, redaction_rules) from the pack
  • Eval context: RunContext + InvokeRequest + structured request context
  • Optional: tool capability declaration (for execute-time checks)
Outputs
  • PolicyDecision (allowed, reason, effective_approval_mode, requires_approval_gate, requires[], prohibits[], policy_decision_id, matched_rule_ids[])
  • Audit metadata persisted to the trace and the DecisionRecord
Canonical types
  • PolicyBundle
  • PolicyDecision
  • ApprovalMode
  • Guardrail

Reference Architecture

The Policy Engine is the deterministic boundary for every governed action. It evaluates JsonLogic rules from the active policy bundles and emits typed PolicyDecision envelopes that the Compiler, Critic, and Tool Gateway consume.

Definition

A pure function over (rules, eval_context) → PolicyDecision. Stateless. Replayable. Never delegated to the model.

Why it exists

Model self-policing is not a security boundary. The Policy Engine moves the decision boundary outside the agent so allow/deny, required evidence, prohibited claims, and approval-mode escalation are deterministic.

Inputs

  • PolicyBundle[] from the active Context Pack (priority-sorted)
  • Guardrails (must_refuse, must_escalate, redaction_rules) from the pack
  • Eval context: RunContext + InvokeRequest + structured request context
  • Optional: tool capability declaration (for execute-time checks)

Outputs

  • PolicyDecision:
    • allowed: bool
    • reason: string
    • effective_approval_mode?: ApprovalMode
    • requires_approval_gate?: string
    • requires[] (evidence references the action must produce)
    • prohibits[] (capabilities the rule disallows)
    • policy_decision_id
    • matched_rule_ids[]
  • Audit metadata persisted to the trace and the Decision Record

How it works

  1. Evaluates Guardrails first — must_refuse short-circuits everything.
  2. Iterates policy bundles in priority order; evaluates each rule’s if against the eval context.
  3. For each fired rule, captures then (allow, requires, prohibits, approval mode, gate).
  4. Resolves conflicts: higher-priority bundle wins; explicit deny beats allow at equal priority.
  5. Computes effective approval mode (max of declared modes, capped at the bundle’s allowed range — never upgrades outside priority).
  6. Emits policy_decision_id and matched_rule_ids[] for audit.

Checkpoints in the runtime

  • Compile-time — guardrail activation, rule selection.
  • Plan-time — Critic verifies plan steps against rules and decision_binding.
  • Execute-time — Tool Gateway re-evaluates per ToolCallEnvelope with the latest context.

Failure modes

  • Enforced rule throws on malformed JsonLogic — fail closed and emit policy_eval_error. Only rules explicitly marked non_enforcing may be skipped with a diagnostic.
  • Two equal-priority rules conflict — explicit deny wins; otherwise refuse.
  • Approval-mode upgrade outside bundle priority — reject at lint time, not runtime.
  • Stale rule cache after a bundle promotion — mitigated by content-addressed bundle hashing.

Operational concerns

  • Policy bundle versioning per environment; promotion is deliberate.
  • Per-rule evaluation latency budget; alert on slow rules.
  • Lint-on-merge: unreachable rules, unsatisfiable conditions, missing decision_binding.
  • Separation of duties for policy authoring vs. approval.

Evaluation metrics

  • Policy compliance rate (fraction of runs with no guardrail violation).
  • Rule-level fire rate; alert on rules that never or always fire.
  • Mean time from policy authored → enforced.
  • Approval-mode misclassification rate.