Approval-Mode Tiers: A Risk Taxonomy You Can Actually Ship

A team I work with shipped an AI assistant for their support org last summer. The pilot went well. The second team adopted it in October. By December the Slack channel was an anthropology project: someone needed an approval step before the agent emailed customers, someone else needed one before refunds, a third before file uploads, a fourth before contract changes, but only above a threshold, and only outside business hours, and only for VIP accounts.

By March they had thirty named approval gates. The ticket to add the thirty-first was the most disliked item on the team’s Jira board. Audit reports took a week. Every new use case made the problem worse, not better.

This is not a tooling problem. It is a taxonomy problem. The team was binding approvals to workflows when they should have been binding them to a risk class.

2026 update: tiers belong in manifests

The refinement since this was first written is that approval mode should appear in every artifact that can change risk: adapter manifests, compiled tool surfaces, critic verdicts, gateway envelopes, approval events, and Decision Records. A tier that exists only in policy prose will drift. A tier emitted into the runtime contract can be enforced, queried, replayed, and reviewed.

The five canonical tiers

ContextOS uses exactly five approval-mode tiers. Every adapter capability declares the highest tier it can produce; every policy bundle can downgrade within its priority but cannot upgrade past that declaration.

Mode	Examples	Default policy
`read_only`	lookups, search, retrieval	allow with audit
`local_write`	tenant-scoped writes that can be reverted in-tenant (notes, drafts, memory)	allow with idempotency key + audit
`network`	outbound calls, webhooks, third-party reads	allow with egress allow-list + rate budget
`delegated`	acts on behalf of a user against an external system (booking, message send, calendar write)	require valid user delegation token + per-call evidence
`destructive`	irreversible side effects (payment capture, account deletion, data export)	require named approver + frozen evidence snapshot + post-execution audit

Five tiers. Not fifty. The constraint is the entire point.
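One way to make "exactly five" hold in code is a closed enum that rejects any other name at parse time. This is a minimal sketch, not ContextOS's actual type: the class name ApprovalMode, the helper parse_mode, and the numeric ordering (taken from the row order of the table above) are my assumptions.

```python
from enum import IntEnum


class ApprovalMode(IntEnum):
    """The five canonical tiers. Numeric order follows the table (an assumption)."""
    READ_ONLY = 0
    LOCAL_WRITE = 1
    NETWORK = 2
    DELEGATED = 3
    DESTRUCTIVE = 4


# Default policy per tier, copied from the table above.
DEFAULT_POLICY = {
    ApprovalMode.READ_ONLY: "allow with audit",
    ApprovalMode.LOCAL_WRITE: "allow with idempotency key + audit",
    ApprovalMode.NETWORK: "allow with egress allow-list + rate budget",
    ApprovalMode.DELEGATED: "require valid user delegation token + per-call evidence",
    ApprovalMode.DESTRUCTIVE: (
        "require named approver + frozen evidence snapshot + post-execution audit"
    ),
}


def parse_mode(name: str) -> ApprovalMode:
    """Map a wire string such as "local_write" to a tier; reject anything else."""
    try:
        return ApprovalMode[name.upper()]
    except KeyError:
        raise ValueError(f"not a canonical approval mode: {name!r}") from None
```

With a closed type, a workflow-named gate such as "gate_refund_approval" fails at registration instead of quietly becoming tier number six.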

Where the tier is enforced

The tier is not just metadata for the UI. It is checked at each place where risk can change shape.

Runtime point	What it checks	Artifact emitted
Adapter registration	Capability declares its maximum possible `approval_mode` and constraints	Signed manifest entry
Context compile	Compiler surfaces only tools allowed by pack, policy, and `RunContext.safety_mode`	`CompiledContext.tool_manifest`
Critic verify	Proposed plan stays within surfaced tools, argument bounds, and effective mode	`critic_verdict`
Tool Gateway	Execute-time policy re-check, credential scope, idempotency, and egress	`toolCall` / `toolResult`
Approval gate	Human or delegated authorization binds to frozen evidence	`approval_event`
Decision Record	Final action records declared mode, effective mode, policy rule, and approver	`DecisionRecord.approvals[]`
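The approval-gate row depends on evidence that cannot change between the moment the approver looks at it and the moment the gateway executes. A hedged sketch of one way to do that is to hash a canonical serialization and bind the approval to the hash. The function names and the "sha256:" prefix are hypothetical; only the evidence_snapshot_hash field name comes from this article.

```python
import hashlib
import json


def snapshot_hash(evidence: dict) -> str:
    """Hash a canonical JSON form so key order does not change the result."""
    canonical = json.dumps(
        evidence, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def make_approval_event(evidence: dict, approver: str) -> dict:
    """Record who approved and exactly which evidence they approved."""
    return {"approver": approver, "evidence_snapshot_hash": snapshot_hash(evidence)}


def approval_still_valid(event: dict, evidence_at_execute: dict) -> bool:
    """Execute-time re-check: the approval only covers the evidence it was bound to."""
    return event["evidence_snapshot_hash"] == snapshot_hash(evidence_at_execute)


# Illustrative evidence; the field names are made up for the example.
evidence = {"intent": "support.refund", "amount_inr": 50000, "order_id": "ord_123"}
event = make_approval_event(evidence, approver="finance.lead")
```

If the amount changes after approval, approval_still_valid returns False and the call should go back through the gate rather than execute on a stale approval.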

Why workflow-named gates rot

When you name gates after workflows, three things happen. They are obvious in retrospect; they are very hard to see in the moment.

First, the taxonomy explodes. Every new use case wants a name, every name acquires a special case, and within a year you have GATE_REFUND_FRAUD_VIP_TIER2_HOLIDAY_OVERRIDE. Auditors do not query that schema; they archaeologize it.

Second, the same business action carries different risk in different contexts, and a single gate name flattens that risk. A 50-rupee refund and a 5-million-rupee refund are not the same risk event. The team that named GATE_REFUND_APPROVAL had to either live with that flattening or invent yet another gate to fix it. They invented one.

Third, cross-workflow comparison is impossible. “Show me every destructive action this quarter, by approver, ranked by override rate” is one line of SQL against a tier taxonomy. It is a research project against a workflow-name taxonomy. The compliance org rapidly learns to stop asking.

A risk taxonomy fixes all three because the tier is canonical and gate names are runtime artifacts that bind to a tier. You can still keep gate names for routing — GATE_FINANCE_APPROVAL is fine as a runtime artifact — but the audit story queries the tier.
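To make the "one line of SQL" claim concrete, here is a sketch against an in-memory SQLite table. The table name decision_records, its columns, the sample rows, and the idea that an override is stored as a 0/1 flag are all assumptions for illustration; a real Decision Record store will look different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE decision_records (
        workflow TEXT, declared_mode TEXT, effective_mode TEXT,
        approver TEXT, overridden INTEGER, decided_at TEXT)"""
)
conn.executemany(
    "INSERT INTO decision_records VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("support.refund", "destructive", "destructive", "alice", 1, "2026-02-03"),
        ("billing.chargeback", "destructive", "destructive", "alice", 0, "2026-02-10"),
        ("account.delete", "destructive", "destructive", "bob", 0, "2026-03-01"),
        # Downgraded by policy: effective mode is no longer destructive.
        ("support.refund", "destructive", "delegated", None, 0, "2026-03-02"),
        # Outside the quarter being queried.
        ("data.export", "destructive", "destructive", "bob", 0, "2025-12-30"),
    ],
)

# Every destructive action this quarter, by approver, ranked by override rate.
# The workflow column never appears in the query: the tier is the only key needed.
QUERY = """
SELECT approver, COUNT(*) AS actions, AVG(overridden) AS override_rate
FROM decision_records
WHERE effective_mode = 'destructive'
  AND decided_at >= '2026-01-01' AND decided_at < '2026-04-01'
GROUP BY approver
ORDER BY override_rate DESC, approver
"""
result = conn.execute(QUERY).fetchall()
```

The same query against thirty workflow-named gates would need a hand-maintained mapping from each gate name to a risk class, which is the taxonomy by another route.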

The downgrade-only invariant

The single most important rule in the model is that policy bundles can downgrade an effective approval mode but cannot upgrade past the wire-time declaration.

Imagine the alternative. A buggy or compromised policy bundle could escalate any read into a destructive action. The blast radius of a runtime mistake would scale with the number of bundles you operate. With downgrade-only, the worst a runtime mistake can do is under-protect a call — never over-empower one. That is the difference between “we can recover” and “we have an incident.”

The wire-time contract on the adapter pins the ceiling. A capability for issuing refunds declares itself destructive once, in its registration. The runtime can decide to downgrade that for VIP customers under a threshold, with a recorded rule, but cannot push it the other way. Even a fully compromised model proposing the most permissive call possible cannot escape the ceiling.
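The invariant fits in a few lines. This is a minimal sketch assuming the tiers are totally ordered as in the table; ApprovalMode, CeilingViolation, and effective_mode are names I have made up for the example, and raising on an upgrade attempt (rather than silently clamping) is a design choice, not something the runtime is confirmed to do.

```python
from enum import IntEnum
from typing import Optional


class ApprovalMode(IntEnum):
    READ_ONLY = 0
    LOCAL_WRITE = 1
    NETWORK = 2
    DELEGATED = 3
    DESTRUCTIVE = 4


class CeilingViolation(Exception):
    """A policy rule asked for a mode above the wire-time declaration."""


def effective_mode(
    declared: ApprovalMode, requested: Optional[ApprovalMode] = None
) -> ApprovalMode:
    """Return the mode a call runs under; policy may lower it, never raise it."""
    if requested is None:
        return declared  # no rule matched: the wire-time declaration stands
    if requested > declared:
        raise CeilingViolation(
            f"policy requested {requested.name}, ceiling is {declared.name}"
        )
    return requested
```

Failing loudly means a bundle that tries to turn a read_only capability into a destructive one shows up as an error in review, instead of being quietly ignored.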

A walkthrough

The intent is support.refund. The wire-time fact is that adp_payments.issue_refund is destructive:

{
  "permission_id": "perm_payments_refund_capped",
  "adapter_id": "adp_payments",
  "capability": "issue_refund",
  "approval_mode": "destructive",
  "arg_constraints": {
    "amount_inr": { "min": 1, "max": 50000 },
    "currency": { "enum": ["INR"] },
    "idempotency_key": { "required": true, "pattern": "^ik_[a-z0-9]{16}$" }
  }
}

A VIP-instant-refund rule downgrades it within bundle priority for low amounts:

{
  "rule_id": "R_VIP_INSTANT_REFUND",
  "applies_to": { "intent": "support.refund" },
  "if": {
    "and": [
      { "==": [{ "var": "request.context.user.is_vip" }, true] },
      { "<=": [{ "var": "request.context.refund_amount" }, 200] }
    ]
  },
  "then": { "allow": true, "approval_mode": "delegated", "requires_approval_gate": null }
}

For a 200-rupee refund to a VIP, the effective mode is delegated. No human gate, but the Decision Record still names the rule that produced the downgrade and the evidence that satisfied it. If an auditor asks why a destructive capability auto-executed, the answer takes seconds: the rule, the evidence, the bundle priority.

For a 50,000-rupee refund the rule would not match, the wire-time destructive mode would stand, and execution would route through propose → approve → execute with a frozen evidence snapshot. Same capability, different effective mode, complete audit either way.
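The two outcomes can be traced in code. The sketch below evaluates the rule with a tiny interpreter that supports only the four operators the rule uses (and, ==, <=, var); it is not a full rule engine, it ignores bundle priority, and resolve, evaluate, and refund_request are hypothetical helpers.

```python
TIERS = ["read_only", "local_write", "network", "delegated", "destructive"]

RULE = {
    "rule_id": "R_VIP_INSTANT_REFUND",
    "applies_to": {"intent": "support.refund"},
    "if": {"and": [
        {"==": [{"var": "request.context.user.is_vip"}, True]},
        {"<=": [{"var": "request.context.refund_amount"}, 200]},
    ]},
    "then": {"allow": True, "approval_mode": "delegated", "requires_approval_gate": None},
}


def evaluate(expr, data):
    """Evaluate the small expression subset used by the rule above."""
    if not isinstance(expr, dict):
        return expr  # a literal such as True or 200
    (op, args), = expr.items()
    if op == "var":
        value = data
        for key in args.split("."):  # walk the dotted path
            value = value[key]
        return value
    values = [evaluate(arg, data) for arg in args]
    if op == "and":
        return all(values)
    if op == "==":
        return values[0] == values[1]
    if op == "<=":
        return values[0] <= values[1]
    raise ValueError(f"unsupported operator: {op}")


def resolve(declared, intent, data, rules):
    """Return the declared mode, the effective mode, and the rule that set it."""
    for rule in rules:
        if rule["applies_to"]["intent"] == intent and evaluate(rule["if"], data):
            requested = rule["then"]["approval_mode"]
            if TIERS.index(requested) > TIERS.index(declared):
                raise ValueError("rule would upgrade past the wire-time ceiling")
            return {"declared_mode": declared, "effective_mode": requested,
                    "policy_rule": rule["rule_id"]}
    return {"declared_mode": declared, "effective_mode": declared, "policy_rule": None}


def refund_request(amount, is_vip):
    return {"request": {"context": {"user": {"is_vip": is_vip}, "refund_amount": amount}}}


small = resolve("destructive", "support.refund", refund_request(200, True), [RULE])
large = resolve("destructive", "support.refund", refund_request(50000, True), [RULE])
```

Here small carries effective_mode "delegated" along with the rule id that produced it, while large keeps "destructive" with no rule recorded, which is the signal to route through the approval gate.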

What changes operationally

Once the taxonomy is canonical, the dashboard rewrites itself. Compliance rate, gate honor rate, latency-to-approval, override rate — all of these become per-tier metrics that compare across workflows. The first chart you’ll want is “destructive-mode share of total calls, by tenant, over time”; an unexpected uptick is the most reliable single indicator of a workflow that’s drifting toward riskier behavior than intended.

Two metrics worth watching specifically:

Cross-tier downgrade rate. High values are not necessarily bad — VIP rules and small-amount carve-outs are legitimate downgrades. They are, however, a leading indicator of misclassified wire-time. If 80% of destructive calls get downgraded, the wire-time tier is probably wrong.
Approver review time per evidence_snapshot_hash. If the median approver spends three seconds on a frozen snapshot, the gate has become a rubber stamp; either the threshold needs to move down, or the snapshot needs more context.
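Both metrics are simple aggregates once each record carries its declared and effective mode. A rough sketch follows; the record shape and the review_seconds field are assumptions, and the sample values are invented purely to exercise the functions.

```python
from statistics import median

# Illustrative records; review_seconds is None when no human gate was involved.
records = [
    {"declared_mode": "destructive", "effective_mode": "delegated",
     "evidence_snapshot_hash": None, "review_seconds": None},
    {"declared_mode": "destructive", "effective_mode": "delegated",
     "evidence_snapshot_hash": None, "review_seconds": None},
    {"declared_mode": "destructive", "effective_mode": "destructive",
     "evidence_snapshot_hash": "sha256:aa", "review_seconds": 2},
    {"declared_mode": "destructive", "effective_mode": "destructive",
     "evidence_snapshot_hash": "sha256:bb", "review_seconds": 3},
    {"declared_mode": "destructive", "effective_mode": "destructive",
     "evidence_snapshot_hash": "sha256:cc", "review_seconds": 45},
    {"declared_mode": "read_only", "effective_mode": "read_only",
     "evidence_snapshot_hash": None, "review_seconds": None},
]


def downgrade_rate(rows, declared="destructive"):
    """Share of calls declared at a tier that ran under a different effective mode.

    Because policy can only lower the mode, "different" means "downgraded".
    """
    at_tier = [r for r in rows if r["declared_mode"] == declared]
    if not at_tier:
        return 0.0
    downgraded = [r for r in at_tier if r["effective_mode"] != declared]
    return len(downgraded) / len(at_tier)


def median_review_seconds(rows):
    """Median approver time across gated calls, one entry per frozen snapshot."""
    times = [r["review_seconds"] for r in rows if r["review_seconds"] is not None]
    return median(times) if times else None
```

On this sample the downgrade rate for destructive is 0.4 and the median review time is 3 seconds, which by the rule of thumb above would already look like a rubber stamp.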

Reviewer checklist

Before a new capability ships, the reviewer should be able to answer these questions without reading the prompt:

Question	Block if
What is the wire-time maximum approval mode?	The adapter registration is missing a canonical tier.
Can policy only constrain the capability?	A rule can make the call more powerful than its manifest ceiling.
Are arguments bounded?	A side-effecting capability lacks schema limits, idempotency, or destination constraints.
Is the evidence snapshot frozen before approval?	The approver sees live, mutable data instead of a hash-addressed snapshot.
Can audit query by tier across workflows?	The only searchable concept is a workflow-specific gate name.
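The "Are arguments bounded?" row is the easiest one to check mechanically. As a sketch, the validator below applies the arg_constraints from the walkthrough's manifest entry to a proposed call. The constraint keys (min, max, enum, required, pattern) are the ones that appear in that entry; the violations function and its messages are hypothetical.

```python
import re

# Copied from the issue_refund manifest entry in the walkthrough.
ARG_CONSTRAINTS = {
    "amount_inr": {"min": 1, "max": 50000},
    "currency": {"enum": ["INR"]},
    "idempotency_key": {"required": True, "pattern": "^ik_[a-z0-9]{16}$"},
}


def violations(args: dict, constraints: dict) -> list:
    """Return human-readable problems; an empty list means the call is in bounds."""
    problems = []
    for name, rule in constraints.items():
        if name not in args:
            # Only arguments marked required are flagged when absent.
            if rule.get("required"):
                problems.append(f"{name}: missing required argument")
            continue
        value = args[name]
        if "min" in rule and value < rule["min"]:
            problems.append(f"{name}: {value} is below min {rule['min']}")
        if "max" in rule and value > rule["max"]:
            problems.append(f"{name}: {value} is above max {rule['max']}")
        if "enum" in rule and value not in rule["enum"]:
            problems.append(f"{name}: {value!r} is not one of {rule['enum']}")
        if "pattern" in rule and not re.fullmatch(rule["pattern"], str(value)):
            problems.append(f"{name}: does not match {rule['pattern']}")
    return problems


ok_call = {"amount_inr": 200, "currency": "INR", "idempotency_key": "ik_0123456789abcdef"}
bad_call = {"amount_inr": 60000, "currency": "USD"}
```

Run at the critic and again at the gateway, a check like this keeps a model-proposed call inside the manifest's bounds even when the plan itself looks plausible.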

What people get wrong

The two most common arguments against this model are both wrong, and they are wrong for the same reason.

The first is: “five tiers will not be enough.” It is enough because tiers are about risk class, not use case. Use cases multiply; risk classes do not. The team I mentioned at the top eventually retired all thirty of their gate names; the runtime now binds the same five tiers across every workflow, and the gate names are routing artifacts.

The second is: “this adds latency.” For most calls it does not: they are read_only or local_write and never see a gate. The gates kick in only for calls that should pay the latency. If your destructive-mode share is non-trivial, your problem is the workflow, not the taxonomy.

Both arguments come from the same place, which is the assumption that an approval taxonomy is something the governance team imposes on the engineering team. It is the opposite. The taxonomy is what lets the engineering team ship without auditing every new gate by hand. It is the constraint that makes velocity possible.

A closing thought

Risk taxonomies are unglamorous. They get out-voted by “let’s just add a gate.” Six months later, the team that resisted the taxonomy has the worst audit experience in the company.

If you want governance you can ship, pick the canonical five and bind every adapter capability to one. Let policy bundles downgrade within their priority. Never let them upgrade past the wire-time ceiling. The rest of the program — the audit reports, the cross-workflow queries, the dashboards your CISO actually reads — falls out of those two rules.