Tutorial 4: Close the improvement loop

Turn an operator correction into a typed Insight, a Strategy rule, and a release-gated promotion.

Tutorial | Last reviewed: 2026-05-04

At a glance

In Tutorial 3 the runtime emitted a typed DecisionRecord. Now we close the loop: an operator correction lands in the Feedback Store, the Insight Synthesizer clusters it, the Strategy Compiler proposes a reusable StrategyRule, and the proposal goes through the same release gate as a pack change. Nothing auto-applies.

What we are building

[Diagram: an ESCALATED DecisionRecord feeds Insight, Strategy, and the Release Gate.]

Capture the correction

The operator overrides a refund decision and records why. The Feedback Store writes a typed entry with provenance: who made the correction, which record it applies to, and when it was captured.

{
  "feedback_id": "fb_2026_05_04_c19",
  "kind": "correction",
  "context": { "decision_record_id": "dr_2026_05_04_a17", "intent": "support.refund" },
  "operator": "user_finance_lead_77",
  "correction": "Refund eligibility should consider prior corrections within 90 days",
  "applied_to_record": "dr_2026_05_04_a17",
  "evidence_refs": ["dr_2026_05_04_a17"],
  "captured_at": "2026-05-04T09:32:00Z"
}
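One way to picture the typed entry is a small Python dataclass that refuses payloads with missing fields. This is a minimal sketch, not the Feedback Store's actual schema code: the `FeedbackEntry` class, the `parse_feedback` helper, and the rule that every field is required are assumptions for illustration. Only the field names and values come from the example above.

```python
import json
from dataclasses import dataclass, fields
from datetime import datetime


@dataclass(frozen=True)
class FeedbackEntry:
    """Hypothetical typed view of one Feedback Store entry."""
    feedback_id: str
    kind: str
    context: dict
    operator: str
    correction: str
    applied_to_record: str
    evidence_refs: list
    captured_at: datetime


def parse_feedback(raw: str) -> FeedbackEntry:
    """Parse a JSON payload, rejecting missing fields (an assumed rule)."""
    data = json.loads(raw)
    missing = [f.name for f in fields(FeedbackEntry) if f.name not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    # Swap the trailing "Z" for an explicit offset so fromisoformat
    # also accepts the timestamp on Python versions before 3.11.
    data["captured_at"] = datetime.fromisoformat(
        data["captured_at"].replace("Z", "+00:00")
    )
    return FeedbackEntry(**data)


raw = """{
  "feedback_id": "fb_2026_05_04_c19",
  "kind": "correction",
  "context": { "decision_record_id": "dr_2026_05_04_a17", "intent": "support.refund" },
  "operator": "user_finance_lead_77",
  "correction": "Refund eligibility should consider prior corrections within 90 days",
  "applied_to_record": "dr_2026_05_04_a17",
  "evidence_refs": ["dr_2026_05_04_a17"],
  "captured_at": "2026-05-04T09:32:00Z"
}"""

entry = parse_feedback(raw)
print(entry.operator, entry.captured_at.isoformat())
```

Parsing `captured_at` into a timezone-aware `datetime` at the boundary means later steps can compare timestamps without re-parsing strings.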

The same correction also writes a correction-class memory record. It outranks prior records on the same key. See Memory.
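The precedence rule can be sketched as a resolver that ranks records by class before recency. The tutorial only states that a correction-class record outranks prior records on the same key. The `MemoryRecord` type, the `observation` class name, the key format, and the tie-break on timestamp are hypothetical choices made for this sketch.

```python
from dataclasses import dataclass

# Assumed precedence table: a higher number wins. Only "correction
# outranks prior records" comes from the tutorial; "observation" is a
# made-up class name used for contrast.
CLASS_RANK = {"observation": 0, "correction": 1}


@dataclass(frozen=True)
class MemoryRecord:
    key: str
    record_class: str
    value: str
    written_at: str  # ISO-8601 UTC, so plain string comparison orders correctly


def resolve(records, key):
    """Return the winning record for `key`, or None if no record matches."""
    candidates = [r for r in records if r.key == key]
    if not candidates:
        return None
    # Rank by class first, then by recency within the same class.
    return max(candidates, key=lambda r: (CLASS_RANK[r.record_class], r.written_at))


records = [
    MemoryRecord("support.refund/eligibility", "observation",
                 "check the order total only", "2026-04-21T09:14:00Z"),
    MemoryRecord("support.refund/eligibility", "correction",
                 "consider prior corrections within 90 days", "2026-05-04T09:32:00Z"),
]

winner = resolve(records, "support.refund/eligibility")
print(winner.record_class, "->", winner.value)
```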

Cluster into an Insight

The Insight Synthesizer scans the trace store and the corrections feed. When a pattern crosses a minimum-occurrence threshold, it emits a typed Insight.

{
  "insight_id": "ins_2026_05_04_a17",
  "kind": "failure_cluster",
  "intent": "support.refund",
  "pattern": "policy.eval denial after order_lookup when refund_amount > 800 INR",
  "occurrences": 23,
  "first_seen": "2026-04-21T09:14:00Z",
  "last_seen": "2026-05-04T08:01:00Z",
  "evidence_refs": ["dr_2026_04_21_x12", "dr_2026_05_03_q88"],
  "confidence": 0.91,
  "status": "proposed"
}

A status of "proposed" means that no human has triaged the insight yet.
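The thresholding step might look like the sketch below: group events by intent and pattern, and emit an Insight only for groups that reach the minimum count. The event shape, the `synthesize` function, the record IDs, and the choice of "at least N" as the meaning of "crosses the threshold" are assumptions. The sketch also leaves out `confidence`, because the tutorial does not say how that value is derived.

```python
from collections import defaultdict


def synthesize(events, min_occurrences=3):
    """Group events by (intent, pattern); emit one Insight per large group."""
    groups = defaultdict(list)
    for event in events:
        groups[(event["intent"], event["pattern"])].append(event)

    insights = []
    for (intent, pattern), members in groups.items():
        if len(members) < min_occurrences:
            continue  # below the threshold: not enough evidence yet
        seen = sorted(m["seen_at"] for m in members)
        insights.append({
            "kind": "failure_cluster",
            "intent": intent,
            "pattern": pattern,
            "occurrences": len(members),
            "first_seen": seen[0],
            "last_seen": seen[-1],
            "evidence_refs": [m["record_id"] for m in members],
            "status": "proposed",  # every new Insight starts untriaged
        })
    return insights


# Made-up events for illustration; the record IDs are not real.
PATTERN = "policy.eval denial after order_lookup when refund_amount > 800 INR"
events = [
    {"intent": "support.refund", "pattern": PATTERN,
     "record_id": "dr_example_1", "seen_at": "2026-04-21T09:14:00Z"},
    {"intent": "support.refund", "pattern": PATTERN,
     "record_id": "dr_example_2", "seen_at": "2026-05-03T11:40:00Z"},
    {"intent": "support.refund", "pattern": PATTERN,
     "record_id": "dr_example_3", "seen_at": "2026-05-04T08:01:00Z"},
    {"intent": "support.refund", "pattern": "timeout in order_lookup",
     "record_id": "dr_example_4", "seen_at": "2026-05-02T10:00:00Z"},
]

insights = synthesize(events, min_occurrences=3)
print(len(insights), insights[0]["occurrences"])
```

The single timeout event stays below the threshold, so only the refund-denial pattern becomes an Insight.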

Compile a Strategy rule

The Strategy Compiler converts the validated insight into a typed StrategyRule that targets exactly one runtime layer: prompt, planner, retrieval, tool_selection, memory_recall, or budget_allocation.

{
  "strategy_rule_id": "str_2026_05_04_b03",
  "applies_to": { "intent": "support.refund" },
  "trigger": { "from_insight": "ins_2026_05_04_a17" },
  "adjustment": {
    "layer": "planner",
    "type": "tool_preference",
    "value": { "prefer_tool": "adp_policy.eval_with_promotion", "over": "adp_policy.eval" }
  },
  "release_gate_target": "support.refund",
  "status": "proposed"
}
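As an illustration of why the rule is typed, the sketch below builds the same structure and rejects any layer outside the six named in this section. The `propose_rule` function and its parameters are hypothetical, and the real Strategy Compiler may validate far more than the layer name.

```python
# The six runtime layers named in this tutorial.
RUNTIME_LAYERS = frozenset({
    "prompt", "planner", "retrieval",
    "tool_selection", "memory_recall", "budget_allocation",
})


def propose_rule(rule_id, insight, layer, adjustment_type, value):
    """Build a StrategyRule proposal from an Insight (illustrative only)."""
    if layer not in RUNTIME_LAYERS:
        raise ValueError(f"unknown runtime layer: {layer!r}")
    return {
        "strategy_rule_id": rule_id,
        "applies_to": {"intent": insight["intent"]},
        "trigger": {"from_insight": insight["insight_id"]},
        "adjustment": {"layer": layer, "type": adjustment_type, "value": value},
        "release_gate_target": insight["intent"],
        "status": "proposed",  # a new rule never starts as released
    }


insight = {"insight_id": "ins_2026_05_04_a17", "intent": "support.refund"}
rule = propose_rule(
    "str_2026_05_04_b03",
    insight,
    layer="planner",
    adjustment_type="tool_preference",
    value={"prefer_tool": "adp_policy.eval_with_promotion", "over": "adp_policy.eval"},
)
print(rule["adjustment"]["layer"], rule["status"])
```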

Release-gate the proposal

The proposal does not auto-apply. The Evaluation Engine replays a golden set of past support.refund runs through the candidate Strategy and computes scorecard deltas against the current baseline. The release gate’s verdict is one of:

pass — policy >= 1.0, safety >= 1.0, utility non-regressive within tolerance, economics improved or non-regressive.
block — any guardrail metric regressed; proposal stays proposed with the failing metrics surfaced.

After a pass verdict, an operator approves the gate result, and the rule transitions proposed → reviewed → approved → released.
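The verdict logic can be sketched as a comparison of two scorecards. This is a simplified reading of the criteria above, under stated assumptions: the `release_gate` function, the scorecard dictionaries, the sample numbers, the tolerance value, and the convention that a higher economics score is better are all illustrative. The golden-set replay that would produce the scorecards is not modeled here.

```python
def release_gate(candidate, baseline, utility_tolerance=0.01):
    """Return (verdict, failing_metrics, deltas) for one proposal."""
    deltas = {metric: candidate[metric] - baseline[metric] for metric in baseline}

    failing = []
    if candidate["policy"] < 1.0:
        failing.append("policy")
    if candidate["safety"] < 1.0:
        failing.append("safety")
    if deltas["utility"] < -utility_tolerance:
        failing.append("utility")  # regressed beyond the tolerance
    if deltas["economics"] < 0:
        failing.append("economics")  # assumes higher is better

    verdict = "block" if failing else "pass"
    return verdict, failing, deltas


# Made-up scorecards for illustration.
baseline = {"policy": 1.0, "safety": 1.0, "utility": 0.82, "economics": 0.70}
candidate = {"policy": 1.0, "safety": 1.0, "utility": 0.815, "economics": 0.74}

verdict, failing, deltas = release_gate(candidate, baseline)
print(verdict, failing)

# A candidate that regresses safety is blocked, and the failing metric is surfaced.
regressed = dict(candidate, safety=0.98)
blocked_verdict, blocked_failing, _ = release_gate(regressed, baseline)
print(blocked_verdict, blocked_failing)
```

Returning the failing metrics alongside the verdict mirrors the requirement that a blocked proposal stays proposed with those metrics surfaced.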

Pin and roll out

A released StrategyRule is bound to the next published pack version. Replay against the prior pack version is unaffected. The proposal lifecycle is uniform across all six improvement primitives — Insight Synthesizer, Strategy Compiler, Feedback Store, Chief-of-Staff, Research Queue, Autotune.

proposed → reviewed → approved → released → (retired | superseded)
                  ↘ rejected
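The lifecycle above behaves like a small state machine, which the following sketch encodes as a transition table. The `TRANSITIONS` table and the `advance` function are hypothetical names. Reading the diagram's branch as "reviewed can move to rejected" is also an assumption based on where the arrow sits.

```python
# Allowed next states for each status. Terminal states map to empty sets.
TRANSITIONS = {
    "proposed": {"reviewed"},
    "reviewed": {"approved", "rejected"},
    "approved": {"released"},
    "released": {"retired", "superseded"},
    "rejected": set(),
    "retired": set(),
    "superseded": set(),
}


def advance(status, target):
    """Return `target` if the move is allowed, otherwise raise ValueError."""
    if target not in TRANSITIONS.get(status, set()):
        raise ValueError(f"illegal transition: {status} -> {target}")
    return target


# Walk a proposal along the happy path, one step at a time.
status = "proposed"
for step in ("reviewed", "approved", "released"):
    status = advance(status, step)
print(status)
```

Because every move goes through `advance`, a proposal cannot jump from proposed straight to released, which matches the statement that nothing auto-applies.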

Where to next

You have completed the canonical contract end-to-end: pack → compile → plan → execute (with destructive gating) → record → improvement. From here:

Build a real workflow: Workflow Examples walks the same loop on three neutral domains.
Go deep on the runtime: Reference Architecture and the per-component specs under Component Specs.
Operate in production: Deployment Blueprint covers multi-region, replay, SLOs.