Intent / Risk Classifier
Decision-plane component that resolves intent + risk_class for the Compiler and Critic.
read_only · local_write · network · delegated · destructive
Bounded execution loop
Resolves canonical intent + risk class so the Compiler and Critic can size budgets, sampling, and gates correctly.
- RunContext.intent (caller-declared, validated by the classifier)
- invokeAgent.input.message, channel, locale
- invokeAgent.input.context (any structured signals already present)
- Intent catalog snapshot
- Canonical intent_id from the Intent-Task Catalog
- risk_class ∈ read_only / local_write / network / delegated / destructive
- confidence and alternatives[] for the Critic to consider
- IntentId
- RiskClass
- ClassifierVerdict
The Intent / Risk Classifier turns a raw user message (and the Run Context) into a canonical intent name and a risk_class that the Intent-Task Catalog can route on.
Definition
A small, fast classifier (typically a distilled LLM or a deterministic ruleset for known channels) that emits an intent ∈ catalog plus a risk_class ∈ approval-mode tiers. Outputs are injected into the Run Context before policy resolution and tool surfacing.
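The shape of the classifier output can be sketched with a few plain data types. This is a minimal illustration, not the actual schema: the type names IntentId, RiskClass, and ClassifierVerdict and the field names come from this page, while the Candidate helper, the field types, and the example intent names are assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import NewType, Tuple

# IntentId is modelled as a plain string alias (assumption).
IntentId = NewType("IntentId", str)


class RiskClass(Enum):
    # The five tiers listed on this page.
    READ_ONLY = "read_only"
    LOCAL_WRITE = "local_write"
    NETWORK = "network"
    DELEGATED = "delegated"
    DESTRUCTIVE = "destructive"


@dataclass(frozen=True)
class Candidate:
    # Hypothetical helper type for entries in alternatives[].
    intent_id: IntentId
    confidence: float


@dataclass(frozen=True)
class ClassifierVerdict:
    intent_id: IntentId
    risk_class: RiskClass
    confidence: float
    alternatives: Tuple[Candidate, ...] = ()


# Example verdict; the intent names are made up for illustration.
verdict = ClassifierVerdict(
    intent_id=IntentId("files.read"),
    risk_class=RiskClass.READ_ONLY,
    confidence=0.93,
    alternatives=(Candidate(IntentId("files.list"), 0.05),),
)
```

Freezing the dataclasses reflects the idea that a verdict is injected into the Run Context once and then only read by downstream components.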
Why it exists
Without a single classification step, each component re-derives intent from the prompt on its own, and the results drift apart. The classifier centralizes the mapping so the Compiler, Planner, and Critic agree on what the request actually is.
Inputs
- RunContext.intent (if the caller declared one; preferred, the classifier validates it)
- invokeAgent.input.message, channel, locale
- invokeAgent.input.context (any structured signals already present)
- Intent catalog snapshot
Outputs
- Canonical intent_id from the Intent-Task Catalog
- risk_class ∈ read_only / local_write / network / delegated / destructive
- confidence and alternatives[] for the Critic to consider
How it works
- If RunContext.intent is supplied and matches the catalog, validate and return it (preferred path).
- Otherwise, run the classifier against the message + channel + locale; produce top-k candidates with confidence.
- Apply tiebreakers: highest specificity, then most recent template version.
- Cross-check the resulting intent’s risk_class against RunContext.safety_mode; refuse if intent risk exceeds safety mode without explicit policy permission.
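The resolution steps above can be sketched as a single function. This is an illustrative sketch under several assumptions: CatalogEntry, its specificity and template_version fields, RiskConflictError, and resolve_intent are hypothetical names; the risk tiers are assumed to be ordered from least to most risky as listed; and safety_mode is assumed to name the highest tier the run permits. The classifier call itself is left out, with its top-k output passed in as (intent_id, confidence) pairs.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Assumed ordering of risk tiers, least to most risky.
RISK_ORDER = ["read_only", "local_write", "network", "delegated", "destructive"]


@dataclass(frozen=True)
class CatalogEntry:
    intent_id: str
    risk_class: str
    specificity: int        # hypothetical tiebreaker field
    template_version: int   # hypothetical tiebreaker field


class RiskConflictError(Exception):
    """Raised when intent risk exceeds what safety_mode permits."""


def resolve_intent(
    declared: Optional[str],
    candidates: List[Tuple[str, float]],
    catalog: Dict[str, CatalogEntry],
    safety_mode: str,
    policy_permits: bool = False,
) -> Tuple[CatalogEntry, float]:
    if declared is not None and declared in catalog:
        # Preferred path: the caller-declared intent matches the catalog.
        entry, confidence = catalog[declared], 1.0
    else:
        # Fallback: pick among classifier candidates known to the catalog.
        known = [(catalog[i], c) for i, c in candidates if i in catalog]
        if not known:
            raise LookupError("no candidate matches the catalog")
        # Highest confidence wins; ties break on specificity, then version.
        entry, confidence = max(
            known,
            key=lambda p: (p[1], p[0].specificity, p[0].template_version),
        )
    # Cross-check the intent's risk tier against the run's safety mode.
    too_risky = RISK_ORDER.index(entry.risk_class) > RISK_ORDER.index(safety_mode)
    if too_risky and not policy_permits:
        raise RiskConflictError(
            f"{entry.intent_id} is {entry.risk_class}, above {safety_mode}"
        )
    return entry, confidence
```

Raising before returning keeps the refusal ahead of any compilation step, which matches the ordering described in the list.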
Failure modes
- Classifier confidence below threshold — the Critic surfaces a clarifying question instead of guessing.
- Intent resolves to a deprecated entry — refuse and emit a typed error.
- Risk-class conflict between intent and safety_mode — refuse before compilation.
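One way to keep these failure modes distinguishable downstream is to map each to a typed outcome. The sketch below is hypothetical: the Outcome enum, the triage function, the 0.7 threshold, and the order in which the checks run are all assumptions, not something this page specifies.

```python
from enum import Enum


class Outcome(Enum):
    PROCEED = "proceed"
    CLARIFY = "clarify"                          # Critic asks a clarifying question
    REFUSE_DEPRECATED = "refuse_deprecated"      # intent maps to a deprecated entry
    REFUSE_RISK_CONFLICT = "refuse_risk_conflict"  # risk exceeds safety_mode


def triage(
    confidence: float,
    deprecated: bool,
    risk_exceeds_safety_mode: bool,
    threshold: float = 0.7,  # assumed value; tune per environment
) -> Outcome:
    # Hard refusals are checked first (assumed precedence).
    if deprecated:
        return Outcome.REFUSE_DEPRECATED
    if risk_exceeds_safety_mode:
        return Outcome.REFUSE_RISK_CONFLICT
    # Low confidence routes to a clarifying question instead of a guess.
    if confidence < threshold:
        return Outcome.CLARIFY
    return Outcome.PROCEED
```

Counting CLARIFY outcomes separately from the two refusal outcomes also makes the clarifying-question rate easy to report.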
Operational concerns
- Classifier model pinned per environment; upgrades go through release-gate evaluation.
- p50 / p99 latency budget folded into the Planner timeout.
- Per-tenant classifier quotas.
- Drift monitoring against golden classification sets.
Evaluation metrics
- Top-1 precision/recall on golden intents.
- Calibration of confidence buckets.
- Refusal rate (clarifying-question routes).
- Risk-class drift detection.
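Calibration of confidence buckets is often summarized as expected calibration error (ECE): the weighted average gap between mean confidence and observed accuracy in each bucket. The sketch below assumes that definition and equal-width buckets; whether this page's evaluation uses ECE specifically is not stated, and the function and parameter names are illustrative.

```python
from typing import List, Tuple


def expected_calibration_error(
    records: List[Tuple[float, bool]],  # (confidence, top-1 was correct)
    n_buckets: int = 10,
) -> float:
    # Group records into equal-width confidence buckets over [0, 1].
    buckets: List[List[Tuple[float, bool]]] = [[] for _ in range(n_buckets)]
    for conf, correct in records:
        idx = min(int(conf * n_buckets), n_buckets - 1)
        buckets[idx].append((conf, correct))

    total = len(records)
    ece = 0.0
    for bucket in buckets:
        if not bucket:
            continue
        mean_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        # Weight each bucket's gap by its share of all records.
        ece += (len(bucket) / total) * abs(accuracy - mean_conf)
    return ece


# Three predictions at 0.9 confidence (two correct) and one at 0.1 (wrong).
golden = [(0.9, True), (0.9, True), (0.9, False), (0.1, False)]
ece = expected_calibration_error(golden)
```

A well-calibrated classifier yields a value near zero; a rising value on the golden set is one possible drift signal.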