May 14, 2026

Financial Crime Operations: Agentic AI Needs Evidence, Not Autonomy


Financial-crime work is full of repetitive investigation, but the judgment is too consequential to hand to an unbounded agent.

The right shape is not “AI closes cases.” The right shape is an evidence-bound casework system: agents gather facts, resolve identity, compare policy, draft narratives, highlight contradictions, recommend disposition, and route high-risk decisions to accountable humans.

That is exactly the Financial Crime Operations playbook in ContextOS.

Why this is agentic-first

KYC, AML, sanctions, and fraud workflows are not single prompts. A case may require customer identity resolution, beneficial ownership, transaction analysis, sanctions screening, adverse-media review, policy interpretation, narrative writing, supervisor review, and regulatory filing.

McKinsey describes agentic AI in financial-crime contexts across client onboarding, KYC checks and refreshes, transaction monitoring, sanctions, and fraud investigations from alert to case closure. The opportunity is real because these workflows are high-volume, evidence-heavy, and cross-system.

The control problem is equally real. A false negative creates regulatory exposure. A false positive can harm a legitimate customer. A weak narrative can fail supervisory or regulator review. ContextOS narrows the agent’s role: assemble and reason over evidence, then preserve the approval boundary.

The Context Pack

The pack declares the casework boundary:

| Layer | Required entries |
| --- | --- |
| decision_layer | fincrime.alert.triage, fincrime.kyc.refresh, fincrime.case.disposition, fincrime.report.file |
| policy_layer | AML policy, sanctions policy, fraud policy, customer risk policy, data-retention policy |
| approval_gates | GATE_INVESTIGATOR_REVIEW, GATE_MLRO_APPROVAL, GATE_REGULATORY_REPORT |
| tooling_layer | Core banking lookup, KYC fetch, transaction analysis, sanctions screen, case update, report filing |
| memory_layer | Case pattern, policy correction, decision outcome candidates |
| evaluation_layer | False-negative rate, escalation quality, narrative completeness, audit acceptance |

The pack should also pin jurisdiction overlays and retention rules. Financial-crime decisions are not portable across policy contexts: a disposition that is defensible under one jurisdiction's rules may not be under another's.
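One way to picture the pack is as a declaration that gets checked for completeness before any agent runs. The following is a minimal sketch, not the ContextOS pack format: the layer names come from the table above, while the `ContextPack` class, the `jurisdiction` and `retention_rule` fields, and the `problems` method are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Layer names are taken from the table above; everything else is illustrative.
REQUIRED_LAYERS = (
    "decision_layer",
    "policy_layer",
    "approval_gates",
    "tooling_layer",
    "memory_layer",
    "evaluation_layer",
)


@dataclass
class ContextPack:
    layers: Dict[str, List[str]]
    jurisdiction: Optional[str] = None    # hypothetical overlay pin
    retention_rule: Optional[str] = None  # hypothetical retention pin

    def problems(self) -> List[str]:
        """Return human-readable reasons the pack is incomplete."""
        issues = [
            f"missing or empty layer: {name}"
            for name in REQUIRED_LAYERS
            if not self.layers.get(name)
        ]
        if not self.jurisdiction:
            issues.append("jurisdiction overlay is not pinned")
        if not self.retention_rule:
            issues.append("retention rule is not pinned")
        return issues
```

A pack that declares only `decision_layer` would come back with one issue for each missing layer plus the two unpinned overlays, which is the kind of result a compiler step could refuse on.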

Agent roles

| Agent | Responsibility | Boundary |
| --- | --- | --- |
| Alert Triage Agent | Clusters alerts and resolves customer, account, and transaction identity. | Cannot close cases. |
| Evidence Agent | Fetches KYC, sanctions, transaction, and adverse-media evidence. | Read-only except case notes. |
| Policy Agent | Maps evidence to policy rules and required case fields. | Cannot override policy. |
| Narrative Agent | Drafts case summary, rationale, and regulator-ready timeline. | Draft only. |
| Supervisor Agent | Checks completeness, contradictions, and approval requirements. | Can block disposition. |

The Supervisor Agent is the casework equivalent of the Critic. It does not have to be smarter than every specialist; it has to enforce the contract.
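The boundary column can be read as an allow-list: each agent may perform a fixed set of actions, and anything else is refused. Here is a minimal sketch under that assumption. The agent keys mirror the roles above, but the action names, `BoundaryViolation`, and `authorize` are hypothetical and not part of any ContextOS API.

```python
# Hypothetical action names; the roles and their limits follow the table above.
ALLOWED_ACTIONS = {
    "alert_triage": {"cluster_alerts", "resolve_identity"},
    "evidence": {"fetch_evidence", "write_case_note"},
    "policy": {"map_evidence_to_policy"},
    "narrative": {"draft_narrative"},
    "supervisor": {"check_completeness", "block_disposition"},
}


class BoundaryViolation(Exception):
    """Raised when an agent attempts an action outside its declared role."""


def authorize(agent: str, action: str) -> None:
    """Raise BoundaryViolation unless the action is in the agent's allow-list."""
    allowed = ALLOWED_ACTIONS.get(agent, set())
    if action not in allowed:
        raise BoundaryViolation(f"{agent} may not perform {action}")
```

Note that no agent in this sketch holds a `close_case` action at all. Closing a case is deliberately absent from every allow-list, so it can only happen through a human approval path.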

Decision gates

Low-risk false positives may be recommended for batch approval. Suspicious activity, sanctions hits, high-risk jurisdictions, and regulatory filings should stay behind approval by a named, accountable person.
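The routing rule can be sketched as a small function. The gate identifiers come from the Context Pack; the flag names, the queue names, and the specific mapping of flags to gates are assumptions made for illustration, and a real deployment would take that mapping from policy.

```python
from typing import Dict, List, Set

# Hypothetical flag names for the high-risk conditions listed above.
MLRO_FLAGS = {"suspicious_activity", "sanctions_hit", "high_risk_jurisdiction"}


def required_gates(flags: Set[str]) -> List[str]:
    """Return the named-approval gates a case must pass, given its risk flags."""
    gates = []
    if flags & MLRO_FLAGS:
        gates.append("GATE_MLRO_APPROVAL")
    if "regulatory_filing" in flags:
        gates.append("GATE_REGULATORY_REPORT")
    return gates


def route(flags: Set[str], recommended_disposition: str) -> Dict[str, object]:
    """Pick a review queue; high-risk flags always win over the recommendation."""
    gates = required_gates(flags)
    if gates:
        return {"queue": "named_approval", "gates": gates}
    if recommended_disposition == "false_positive":
        return {"queue": "batch_approval", "gates": ["GATE_INVESTIGATOR_REVIEW"]}
    return {"queue": "investigator_review", "gates": ["GATE_INVESTIGATOR_REVIEW"]}
```

The ordering is the design choice that matters: risk flags are evaluated before the agent's recommendation, so a case flagged as a sanctions hit cannot reach the batch queue even if an agent labels it a false positive.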

What a useful DecisionRecord contains

For financial-crime operations, the DecisionRecord should include:

{
  "decision_key": "fincrime.case.disposition",
  "subject_ids": [
    "ceid_customer_8f4a",
    "ceid_account_1309",
    "ceid_alert_cluster_77"
  ],
  "outputs": {
    "recommended_disposition": "escalate_to_mlro",
    "risk_factors": ["sanctions_name_similarity", "unusual_velocity"],
    "narrative_ref": "artifact_case_narrative_481"
  },
  "evidence_refs": [
    "evidence_kyc_snapshot_11",
    "evidence_txn_graph_91",
    "evidence_sanctions_screen_42"
  ],
  "policy_decisions": [
    "policy_decision_aml_v7_rule_19",
    "policy_decision_sanctions_v4_rule_03"
  ],
  "approvals": [
    "approval_mlro_203"
  ],
  "controls_active": [
    "GATE_MLRO_APPROVAL",
    "NO_AUTO_CLOSE_HIGH_RISK"
  ]
}

The record is not a compliance afterthought. It is how the case gets reviewed, queried, replayed, and improved.
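Because the record is structured, basic review checks can run before a human ever opens the case. The sketch below assumes the field names from the example record. The `record_problems` function and the rule that any active `GATE_` control requires at least one recorded approval are illustrative assumptions, not documented ContextOS behavior.

```python
from typing import Dict, List

# Field names follow the example DecisionRecord shown earlier in this post.
REQUIRED_FIELDS = (
    "decision_key",
    "subject_ids",
    "outputs",
    "evidence_refs",
    "policy_decisions",
    "approvals",
    "controls_active",
)


def record_problems(record: Dict[str, object]) -> List[str]:
    """Return reasons a DecisionRecord should not be treated as final."""
    issues = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in record]
    if not record.get("evidence_refs"):
        issues.append("no evidence_refs: the disposition is not evidence-bound")
    if not record.get("policy_decisions"):
        issues.append("no policy_decisions: the disposition cites no policy rule")
    # Assumed rule: an active approval gate needs at least one recorded approval.
    gated = [c for c in record.get("controls_active", []) if c.startswith("GATE_")]
    if gated and not record.get("approvals"):
        issues.append("approval gate is active but no approval is recorded")
    return issues
```

Run against the example record, this check returns an empty list. Remove the `approvals` entry and it reports the missing approval, which is the signal a replay or audit query would look for.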

Failure modes to block

| Failure | Control |
| --- | --- |
| Identity collision | CEID/SID resolution proves customer, account, and transaction relationships. |
| Source conflict | Contradictory KYC or sanctions evidence triggers escalation. |
| Policy version mismatch | Compiler refuses packs that mix incompatible policy versions. |
| Narrative without evidence | Supervisor blocks claims without evidence_refs. |
| Over-automation | High-risk disposition remains a recommendation until approval. |
| Memory poisoning | Case patterns enter review before promotion. |
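As one concrete instance, the "narrative without evidence" control could be implemented as a check over the claims in a drafted narrative. This is a sketch under assumptions: the claim structure (a `text` plus a list of `evidence_refs`) and the `unsupported_claims` function are hypothetical, and the check only verifies that references exist and resolve, not that the evidence actually supports the claim.

```python
from typing import Dict, List, Set


def unsupported_claims(claims: List[Dict[str, object]], known_evidence: Set[str]) -> List[str]:
    """Return the text of each claim the supervisor should block.

    A claim is blocked when it cites no evidence, or cites a reference
    that is not part of the evidence collected for this case.
    """
    blocked = []
    for claim in claims:
        refs = claim.get("evidence_refs", [])
        if not refs or any(ref not in known_evidence for ref in refs):
            blocked.append(claim["text"])
    return blocked
```

A non-empty result would keep the disposition from moving forward until the Narrative Agent either attaches evidence or drops the claim.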

Metrics that matter

  • case package completeness,
  • false-positive reduction without false-negative increase,
  • investigator acceptance rate,
  • time from alert to first decision,
  • regulatory filing defect rate,
  • supervisor block rate,
  • policy gaps discovered through operator correction,
  • replay pass rate on sampled cases.
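Several of these metrics are simple rates over reviewed cases, and the second one is a guard rather than a single number. The sketch below shows both shapes. The per-case keys (`supervisor_blocked`, `investigator_accepted`) and the function names are hypothetical, and how the false-positive and false-negative rates are measured is left to the evaluation process.

```python
from typing import Dict, List, Optional


def rate(cases: List[Dict[str, bool]], key: str) -> Optional[float]:
    """Share of cases where `key` is truthy; None when there are no cases."""
    if not cases:
        return None
    return sum(1 for case in cases if case.get(key)) / len(cases)


def improvement_is_safe(before: Dict[str, float], after: Dict[str, float]) -> bool:
    """True only if false positives fell and false negatives did not rise."""
    return (
        after["false_positive_rate"] < before["false_positive_rate"]
        and after["false_negative_rate"] <= before["false_negative_rate"]
    )
```

For example, `rate(cases, "supervisor_blocked")` gives the supervisor block rate for a sample, and `improvement_is_safe` rejects any change that buys fewer false positives at the cost of more missed cases.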

The goal is not maximum autonomy. The goal is higher-quality casework with a shorter path from alert to accountable decision.
