Trust is not a tooltip that says “AI-generated.”
For agentic products, trust is a product surface made of authority, evidence, control, audit, and recovery. If the agent can affect the world, the PM must design how permission works.
The core question is:
What is this system allowed to do, on whose authority, with what evidence, and how can a human stop or correct it?
In ContextOS, this lives in Governance: policy bundles, approval modes, approval gates, evidence requirements, audit records, and DecisionRecords.
The mistake: human-in-the-loop as a checkbox
“Human-in-the-loop” is too vague.
Which human?
At what point?
With what evidence?
Can they edit, reject, approve, undo, or only observe?
Does their decision become training signal?
Does the agent resume against the same frozen facts, or can the world change underneath it?
A PM spec must answer these questions.
Five kinds of human control
Human control is not one thing.
| Control type | Product meaning | Example |
|---|---|---|
| Preview | Human sees output before user/system impact | review email draft |
| Edit | Human changes the artifact | revise renewal memo |
| Approve | Human authorizes a bounded action | approve high-value refund |
| Override | Human corrects an agent decision | mark customer eligible |
| Undo / compensate | Human reverses or mitigates an effect | cancel scheduled message |
Each control belongs at a different point in the workflow.
Approval modes are the PM risk language
ContextOS uses approval modes as the risk taxonomy:
| Mode | Product interpretation | Default PM posture |
|---|---|---|
| read_only | Looks up or summarizes evidence | Allow with trace |
| local_write | Creates tenant-local drafts or notes | Allow with edit/undo |
| network | Calls external services or fetches external data | Allow with egress policy |
| delegated | Acts on behalf of a user | Require valid delegation and evidence |
| destructive | Irreversible or high-impact effect | Require named approval and frozen evidence |
This is much clearer than “low/medium/high risk.” It describes what the agent can actually do.
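One way to make this taxonomy operational is to treat the modes as an ordered scale, so policy can cap each intent at a maximum mode and decide when a human gate is required. This is an illustrative sketch only: the ordering, the class names, and the `requires_human_gate` cutoff are assumptions, not a ContextOS API.

```python
from enum import IntEnum

class ApprovalMode(IntEnum):
    """Approval modes as an ordered risk scale (names mirror the table above)."""
    READ_ONLY = 0
    LOCAL_WRITE = 1
    NETWORK = 2
    DELEGATED = 3
    DESTRUCTIVE = 4

def requires_human_gate(mode: ApprovalMode) -> bool:
    # Assumption for this sketch: delegated and destructive actions
    # always need a named human approver.
    return mode >= ApprovalMode.DELEGATED

def allowed(requested: ApprovalMode, max_for_intent: ApprovalMode) -> bool:
    # An action is permitted only if it stays within the intent's cap.
    return requested <= max_for_intent
```

Encoding the modes as an ordered type, rather than strings, lets the same comparison drive policy checks, UI badges, and eval assertions.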
Design approval as a product moment
An approval screen should not ask:
Approve?
It should show:
| Approval field | Why it matters |
|---|---|
| Proposed action | What will happen if approved |
| Effective approval mode | Why this gate is required |
| Evidence snapshot | What facts the agent used |
| Policy rule | Which rule triggered the gate |
| Expected side effect | What changes in the world |
| Idempotency key | How duplicate execution is prevented |
| Rollback or compensation | What can be undone |
| Approver identity | Who takes responsibility |
The approval is not a button. It is a typed handshake.
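A minimal sketch of that handshake as a typed structure, carrying the fields from the table above. The field names here are assumptions for illustration; the point is that a gate can refuse any packet with a missing field instead of rendering a bare "Approve?" button.

```python
from dataclasses import dataclass, fields

@dataclass(frozen=True)
class ApprovalPacket:
    """Typed approval packet: one field per row of the approval-screen table."""
    proposed_action: str
    approval_mode: str
    evidence_refs: tuple
    policy_rule: str
    expected_side_effect: str
    idempotency_key: str
    rollback_plan: str
    approver_id: str

    def is_complete(self) -> bool:
        # A gate should refuse to render an approval with any field missing.
        return all(getattr(self, f.name) for f in fields(self))
```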
Freeze the evidence
For high-risk actions, the approval must bind to a frozen evidence snapshot.
Bad:
Agent asks finance to approve a refund. Ten minutes later it re-runs the lookup and refunds a different amount.
Good:
Agent freezes order lookup, policy result, refund amount, customer identity, and proposed tool args. Finance approves that exact packet. The Tool Gateway executes only if the packet hash matches.
This is how approval becomes audit-grade.
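The hash-matching step can be sketched in a few lines. This assumes the frozen packet is JSON-serializable and uses canonical serialization so the same facts always produce the same hash; the function names are illustrative, not a Tool Gateway API.

```python
import hashlib
import json

def packet_hash(packet: dict) -> str:
    # Canonical serialization (sorted keys, fixed separators) so the
    # same facts always hash to the same value.
    canonical = json.dumps(packet, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def execute_if_unchanged(approved_hash: str, current_packet: dict, tool_call):
    # Execute only against the exact packet the approver saw.
    if packet_hash(current_packet) != approved_hash:
        raise PermissionError("evidence changed since approval; re-approval required")
    return tool_call(current_packet)
```

If the world changes between approval and execution, the hash mismatch forces a fresh approval instead of silently acting on stale authority.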
The trust matrix PMs should write
For each intent:
| Intent | Max mode | Human role | Evidence required | Product fallback |
|---|---|---|---|---|
| support.refund.explain | read_only | none | order + policy refs | say what is missing |
| support.refund.draft | local_write | editor | order + policy refs | create draft only |
| support.refund.execute_small | delegated | customer support rep | identity + order + policy | escalate on mismatch |
| support.refund.execute_high | destructive | finance approver | frozen packet | gate required |
This table becomes policy, UI, and eval coverage.
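One way the table becomes all three at once is to encode it as data that the policy engine, the UI, and the eval harness each read. A sketch under that assumption, with a deny-by-default lookup; the dictionary shape is hypothetical:

```python
# Trust matrix as data: intent names come from the table above.
TRUST_MATRIX = {
    "support.refund.explain":       {"max_mode": "read_only",   "human": None},
    "support.refund.draft":         {"max_mode": "local_write", "human": "editor"},
    "support.refund.execute_small": {"max_mode": "delegated",   "human": "support_rep"},
    "support.refund.execute_high":  {"max_mode": "destructive", "human": "finance_approver"},
}

def human_gate_for(intent: str):
    # Unknown intents are denied, never defaulted to an open mode.
    row = TRUST_MATRIX.get(intent)
    if row is None:
        raise KeyError(f"no trust-matrix entry for {intent}; deny by default")
    return row["human"]
```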
Design graceful failure
Agentic products should fail in ways users understand.
| Failure | Bad response | Good response |
|---|---|---|
| Missing evidence | "I cannot help" | "I need the signed contract or contract ID before I can proceed" |
| Policy denial | "Denied" | "This cannot be approved because rule R_HIGH_VALUE_REQUIRES_FINANCE applies" |
| Tool outage | "Something went wrong" | "ERP lookup is unavailable; I saved the packet and will retry or escalate" |
| Ambiguous authority | Agent guesses | Ask for delegation or route to approver |
| Safety conflict | Silent refusal | Explain allowed next step without leaking sensitive detail |
Graceful failure is not softness. It is operational clarity.
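The failure table can be enforced rather than hoped for: map each failure class to a response template so the product never emits a bare "Denied". A minimal sketch, with hypothetical template names and a deliberately safe fallback for unmapped failures:

```python
# Failure classes mapped to the "good response" column of the table above.
FAILURE_RESPONSES = {
    "missing_evidence": "I need the signed contract or contract ID before I can proceed.",
    "policy_denial": "This cannot be approved because rule {rule} applies.",
    "tool_outage": "{tool} is unavailable; I saved the packet and will retry or escalate.",
}

def explain_failure(kind: str, **details) -> str:
    # Unmapped failures still get an actionable message, not silence.
    template = FAILURE_RESPONSES.get(kind, "I can't complete this; routing to a human.")
    return template.format(**details)
```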
Trust is also user education
Users need the right mental model.
Tell them:
- what the agent can do,
- what it cannot do,
- when it will ask for approval,
- what evidence it uses,
- how to correct it,
- how to inspect a receipt.
Do not overpromise autonomy. A product that says “the agent handles everything” creates the wrong expectations and the wrong escalation behavior.
A worked example: healthcare prior authorization
Goal:
Help clinic staff prepare prior authorization packets and submit low-risk requests.
Trust matrix:
| Intent | Mode | Human control |
|---|---|---|
| priorauth.intake | read_only | none |
| priorauth.evidence_packet | local_write | edit packet |
| priorauth.submit_standard | delegated | staff preview and submit |
| priorauth.appeal_denial | delegated | clinician approval |
| priorauth.clinical_exception | destructive equivalent | clinician sign-off required |
Must-never clauses:
- Do not invent clinical facts.
- Do not submit without patient identity and payer policy evidence.
- Do not override clinician judgment.
- Do not expose unrelated PHI in the packet.
- Do not retry submission after denial without human review.
PM acceptance criteria:
- Every submitted packet has evidence refs.
- Every clinician-required case triggers a gate.
- Every denial produces a clear next-step packet.
- Every correction enters FeedbackStore.
The DecisionRecord is the trust receipt
After a high-risk action, the product should be able to show:
```yaml
decision_record:
  intent: priorauth.submit_standard
  patient_ref: redacted
  evidence_refs:
    - diagnosis_code_source
    - payer_policy_section
    - treatment_history
  policy_rules:
    - R_PAYER_EVIDENCE_REQUIRED
    - R_CLINICIAN_REVIEW_FOR_EXCEPTION
  approval:
    mode: delegated
    actor: clinic_staff_42
    timestamp: 2026-05-13T09:02:11Z
  tool_effect:
    tool: payer.submit_priorauth
    status: success
    trace_id: trace_pa_921
```

This receipt is what lets support, compliance, and product review the decision later.
Product metrics for trust
Track:
| Metric | Target |
|---|---|
| Approval gate honored rate | 100% |
| Evidence coverage for gated actions | 100% |
| Audit gap rate | 0 |
| Unsupported claim rate | 0 |
| Operator correction replay coverage | increasing |
| User trust recovery after failure | measured by retry/continue rate |
| Escalation clarity | qualitative review plus time-to-resolution |
Trust metrics should be reviewed with the same seriousness as conversion metrics.
PM checklist
Before launch:
- Have we assigned max approval mode per intent?
- Does every tool declare side-effect class?
- Does every high-risk action require evidence refs?
- Is the approval packet clear to a non-engineer?
- Can approvers reject with rationale?
- Does rejection become structured feedback?
- Can users inspect what happened?
- Can support find the DecisionRecord?
- Can the prior harness tuple be restored?
If not, the product is not ready for real authority.