Trust is not a tooltip that says “AI-generated.”
For agentic products, trust is a product surface made of authority, evidence, control, audit, and recovery. If the agent can affect the world, the PM must design how permission works.
The core question is:
What is this system allowed to do, on whose authority, with what evidence, and how can a human stop or correct it?
In ContextOS, this lives in Governance: policy bundles, approval modes, approval gates, evidence requirements, audit records, and DecisionRecords.
The mistake: human-in-the-loop as a checkbox
“Human-in-the-loop” is too vague.
Which human?
At what point?
With what evidence?
Can they edit, reject, approve, undo, or only observe?
Does their decision become training signal?
Does the agent resume against the same frozen facts, or can the world change underneath it?
A PM spec must answer these questions.
Five kinds of human control
Human control is not one thing.
| Control type | Product meaning | Example |
|---|---|---|
| Preview | Human sees output before user/system impact | review email draft |
| Edit | Human changes the artifact | revise renewal memo |
| Approve | Human authorizes a bounded action | approve high-value refund |
| Override | Human corrects an agent decision | mark customer eligible |
| Undo / compensate | Human reverses or mitigates an effect | cancel scheduled message |
Each control belongs at a different point in the workflow.
Approval modes are the PM risk language
ContextOS uses approval modes as the risk taxonomy:
| Mode | Product interpretation | Default PM posture |
|---|---|---|
| read_only | Looks up or summarizes evidence | Allow with trace |
| local_write | Creates tenant-local drafts or notes | Allow with edit/undo |
| network | Calls external services or fetches external data | Allow with egress policy |
| delegated | Acts on behalf of a user | Require valid delegation and evidence |
| destructive | Irreversible or high-impact effect | Require named approval and frozen evidence |
This is much clearer than “low/medium/high risk.” It describes what the agent can actually do.
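One way to make this taxonomy operational is to treat the modes as an ordered scale, so policy can cap each intent at a maximum mode and decide when a human gate is required. This is an illustrative sketch only: the ordering, the class names, and the `requires_human_gate` cutoff are assumptions, not a ContextOS API.

```python
from enum import IntEnum

class ApprovalMode(IntEnum):
    """Approval modes as an ordered risk scale (names mirror the table above)."""
    READ_ONLY = 0
    LOCAL_WRITE = 1
    NETWORK = 2
    DELEGATED = 3
    DESTRUCTIVE = 4

def requires_human_gate(mode: ApprovalMode) -> bool:
    # Assumption for this sketch: delegated and destructive actions
    # always need a named human approver.
    return mode >= ApprovalMode.DELEGATED

def allowed(requested: ApprovalMode, max_for_intent: ApprovalMode) -> bool:
    # An action is permitted only if it stays within the intent's cap.
    return requested <= max_for_intent
```

Encoding the modes as an ordered type, rather than strings, lets the same comparison drive policy checks, UI badges, and eval assertions.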
Design approval as a product moment
An approval screen should not ask:
Approve?
It should show:
| Approval field | Why it matters |
|---|---|
| Proposed action | What will happen if approved |
| Effective approval mode | Why this gate is required |
| Evidence snapshot | What facts the agent used |
| Policy rule | Which rule triggered the gate |
| Expected side effect | What changes in the world |
| Idempotency key | How duplicate execution is prevented |
| Rollback or compensation | What can be undone |
| Approver identity | Who takes responsibility |
The approval is not a button. It is a typed handshake.
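A minimal sketch of that handshake as a typed structure, carrying the fields from the table above. The field names here are assumptions for illustration; the point is that a gate can refuse any packet with a missing field instead of rendering a bare "Approve?" button.

```python
from dataclasses import dataclass, fields

@dataclass(frozen=True)
class ApprovalPacket:
    """Typed approval packet: one field per row of the approval-screen table."""
    proposed_action: str
    approval_mode: str
    evidence_refs: tuple
    policy_rule: str
    expected_side_effect: str
    idempotency_key: str
    rollback_plan: str
    approver_id: str

    def is_complete(self) -> bool:
        # A gate should refuse to render an approval with any field missing.
        return all(getattr(self, f.name) for f in fields(self))
```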
Freeze the evidence
For high-risk actions, the approval must bind to a frozen evidence snapshot.
Bad:
Agent asks finance to approve a refund. Ten minutes later it re-runs the lookup and refunds a different amount.
Good:
Agent freezes order lookup, policy result, refund amount, customer identity, and proposed tool args. Finance approves that exact packet. The Tool Gateway executes only if the packet hash matches.
This is how approval becomes audit-grade.
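The hash-matching step can be sketched in a few lines. This assumes the frozen packet is JSON-serializable and uses canonical serialization so the same facts always produce the same hash; the function names are illustrative, not a Tool Gateway API.

```python
import hashlib
import json

def packet_hash(packet: dict) -> str:
    # Canonical serialization (sorted keys, fixed separators) so the
    # same facts always hash to the same value.
    canonical = json.dumps(packet, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def execute_if_unchanged(approved_hash: str, current_packet: dict, tool_call):
    # Execute only against the exact packet the approver saw.
    if packet_hash(current_packet) != approved_hash:
        raise PermissionError("evidence changed since approval; re-approval required")
    return tool_call(current_packet)
```

If the world changes between approval and execution, the hash mismatch forces a fresh approval instead of silently acting on stale authority.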
The trust matrix PMs should write
For each intent:
| Intent | Max mode | Human role | Evidence required | Product fallback |
|---|---|---|---|---|
| support.refund.explain | read_only | none | order + policy refs | say what is missing |
| support.refund.draft | local_write | editor | order + policy refs | create draft only |
| support.refund.execute_small | delegated | customer support rep | identity + order + policy | escalate on mismatch |
| support.refund.execute_high | destructive | finance approver | frozen packet | gate required |
This table becomes policy, UI, and eval coverage.
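One way the table becomes all three at once is to encode it as data that the policy engine, the UI, and the eval harness each read. A sketch under that assumption, with a deny-by-default lookup; the dictionary shape is hypothetical:

```python
# Trust matrix as data: intent names come from the table above.
TRUST_MATRIX = {
    "support.refund.explain":       {"max_mode": "read_only",   "human": None},
    "support.refund.draft":         {"max_mode": "local_write", "human": "editor"},
    "support.refund.execute_small": {"max_mode": "delegated",   "human": "support_rep"},
    "support.refund.execute_high":  {"max_mode": "destructive", "human": "finance_approver"},
}

def human_gate_for(intent: str):
    # Unknown intents are denied, never defaulted to an open mode.
    row = TRUST_MATRIX.get(intent)
    if row is None:
        raise KeyError(f"no trust-matrix entry for {intent}; deny by default")
    return row["human"]
```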
Design graceful failure
Agentic products should fail in ways users understand.
| Failure | Bad response | Good response |
|---|---|---|
| Missing evidence | "I cannot help" | "I need the signed contract or contract ID before I can proceed" |
| Policy denial | "Denied" | "This cannot be approved because rule R_HIGH_VALUE_REQUIRES_FINANCE applies" |
| Tool outage | "Something went wrong" | "ERP lookup is unavailable; I saved the packet and will retry or escalate" |
| Ambiguous authority | Agent guesses | Ask for delegation or route to approver |
| Safety conflict | Silent refusal | Explain allowed next step without leaking sensitive detail |
Graceful failure is not softness. It is operational clarity.
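The failure table can be enforced rather than hoped for: map each failure class to a response template so the product never emits a bare "Denied". A minimal sketch, with hypothetical template names and a deliberately safe fallback for unmapped failures:

```python
# Failure classes mapped to the "good response" column of the table above.
FAILURE_RESPONSES = {
    "missing_evidence": "I need the signed contract or contract ID before I can proceed.",
    "policy_denial": "This cannot be approved because rule {rule} applies.",
    "tool_outage": "{tool} is unavailable; I saved the packet and will retry or escalate.",
}

def explain_failure(kind: str, **details) -> str:
    # Unmapped failures still get an actionable message, not silence.
    template = FAILURE_RESPONSES.get(kind, "I can't complete this; routing to a human.")
    return template.format(**details)
```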
Trust is also user education
Users need the right mental model.
Tell them:
- what the agent can do,
- what it cannot do,
- when it will ask for approval,
- what evidence it uses,
- how to correct it,
- how to inspect a receipt.
Do not overpromise autonomy. A product that says “the agent handles everything” creates the wrong expectations and the wrong escalation behavior.
A worked example: healthcare prior authorization
Goal:
Help clinic staff prepare prior authorization packets and submit low-risk requests.
Trust matrix:
| Intent | Mode | Human control |
|---|---|---|
| priorauth.intake | read_only | none |
| priorauth.evidence_packet | local_write | edit packet |
| priorauth.submit_standard | delegated | staff preview and submit |
| priorauth.appeal_denial | delegated | clinician approval |
| priorauth.clinical_exception | destructive equivalent | clinician sign-off required |
Must-never clauses:
- Do not invent clinical facts.
- Do not submit without patient identity and payer policy evidence.
- Do not override clinician judgment.
- Do not expose unrelated PHI in the packet.
- Do not retry submission after denial without human review.
PM acceptance criteria:
- Every submitted packet has evidence refs.
- Every clinician-required case triggers a gate.
- Every denial produces a clear next-step packet.
- Every correction enters FeedbackStore.
The DecisionRecord is the trust receipt
After a high-risk action, the product should be able to show:
```yaml
decision_record:
  intent: priorauth.submit_standard
  patient_ref: redacted
  evidence_refs:
    - diagnosis_code_source
    - payer_policy_section
    - treatment_history
  policy_rules:
    - R_PAYER_EVIDENCE_REQUIRED
    - R_CLINICIAN_REVIEW_FOR_EXCEPTION
  approval:
    mode: delegated
    actor: clinic_staff_42
    timestamp: 2026-05-13T09:02:11Z
  tool_effect:
    tool: payer.submit_priorauth
    status: success
    trace_id: trace_pa_921
```

This receipt is what lets support, compliance, and product review the decision later.
Product metrics for trust
Track:
| Metric | Target |
|---|---|
| Approval gate honored rate | 100% |
| Evidence coverage for gated actions | 100% |
| Audit gap rate | 0 |
| Unsupported claim rate | 0 |
| Operator correction replay coverage | increasing |
| User trust recovery after failure | measured by retry/continue rate |
| Escalation clarity | qualitative review plus time-to-resolution |
Trust metrics should be reviewed with the same seriousness as conversion metrics.
PM checklist
Before launch:
- Have we assigned max approval mode per intent?
- Does every tool declare side-effect class?
- Does every high-risk action require evidence refs?
- Is the approval packet clear to a non-engineer?
- Can approvers reject with rationale?
- Does rejection become structured feedback?
- Can users inspect what happened?
- Can support find the DecisionRecord?
- Can the prior harness tuple be restored?
If not, the product is not ready for real authority.