Memory & evidence
April 25, 2026
by Piyush · 6 min read

Promotion-Aware Memory: Capture, Review, Promote, Recall in Code

ContextOS
Harness Engineering
Memory
Promotion
Contradiction Handling
Build-along

The first version of memory in our agent stack was a thin wrapper over a key-value store. Every conversation turn appended a row. Every compile did a recall. Two weeks in, the model started telling customers about returns policies that had been replaced six months earlier. The recall was fine; the substrate was poisoned.

We did not fix it by remembering less. We fixed it by promoting less. Capture stays append-only and unbounded. Promotion is gated, typed, contradiction-aware, and tier-bound. Recall — the only thing the compiler reads — sees only the promoted slice.

This post is the four-step pipeline in code. The canonical spec is in Memory and Memory Fabric; this post is the operator’s compressed version with schemas you can paste into a database and code that runs.

2026 update: recall is a privilege

The important refinement is to treat recall as a privilege granted only after promotion. Capture is cheap and broad; recall is narrow and governed. A memory candidate can be useful evidence for review without being safe enough to enter the next prompt.

That framing makes memory fit the rest of the harness. The promotion queue is a Trust-plane boundary; promoted memory is compiled by the Context plane; contradictions and retractions become audit artifacts; and replay can explain why a fact was visible on one run but absent on another.

The four-step pipeline

agent / operator / system
        ↓ append-only, immutable
  MemoryCandidate          (capture)
        ↓ dedup + contradiction + classification
  ReviewedCandidate        (review queue)
        ↓ promotion verdict + tier assignment
  PromotedMemory           (the only thing recall reads)
        ↓ tier-aware query
  recall()                 (compiler stage 5)

Four artifacts. Four typed transitions. The compiler at the end never reads MemoryCandidate; promotion is the only path in. That property is the whole reason the pipeline exists.

Step 1 — capture

The capture surface is one append-only function. Anything at all can write to it; nothing about a capture decides what happens next.

harness/memory/capture.ts
export type MemoryCandidate = {
  id: string                       // mc_2026_05_09_a17
  tenant_id: string
  user_id?: string
  intent_id?: string               // optional — what the agent was doing
  source: "agent" | "operator" | "system"
  text: string                     // the candidate fact / preference / observation
  evidence_refs: string[]          // pinned refs that justify this candidate
  classification: string           // "PII" | "INTERNAL" | "PUBLIC"
  captured_at: string
}
 
// db and shortid are assumed to be provided by the harness (imports not shown)
export async function captureMemory(c: Omit<MemoryCandidate, "id" | "captured_at">): Promise<MemoryCandidate> {
  // spread the caller's fields first so they can never override id or captured_at
  const candidate: MemoryCandidate = {
    ...c,
    id: `mc_${Date.now()}_${shortid()}`,
    captured_at: new Date().toISOString(),
  }
  // append-only insert: never updates, never deletes
  await db.insert("memory_candidates", candidate)
  return candidate
}

Two properties.

Capture is immutable. No update, no delete. If a candidate is wrong, you write a new candidate that supersedes it (with a supersedes: link in the review queue, not in the candidate itself).

Capture is classification-required. Every candidate carries a data class at write time. Promotion later relies on this; recall later filters on it. A candidate without a classification is a candidate the system cannot reason about safely.
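
One way to make both properties hard to violate is to enforce them at the write boundary. The sketch below is a minimal in-memory stand-in, not the storage layer used in this post: DataClass, isDataClass, and CandidateLog are hypothetical names, and the three class labels are the ones listed in the comment on the classification field.

```typescript
// Hypothetical sketch: an append-only, classification-checked capture log.
const DATA_CLASSES = ["PII", "INTERNAL", "PUBLIC"] as const
type DataClass = (typeof DATA_CLASSES)[number]

// Type guard: narrows an arbitrary string to one of the known data classes.
function isDataClass(x: string): x is DataClass {
  return DATA_CLASSES.some((d) => d === x)
}

type Candidate = { id: string; text: string; classification: DataClass }

class CandidateLog {
  private readonly rows: Candidate[] = []

  // The only write path: validates the data class, freezes the row, appends it.
  append(id: string, text: string, classification: string): Candidate {
    if (!isDataClass(classification)) {
      throw new Error(`unclassified candidate ${id}: "${classification}"`)
    }
    const row = Object.freeze({ id, text, classification })
    this.rows.push(row)
    return row
  }

  // Readers get a copy, so they cannot reorder or truncate the log itself.
  all(): readonly Candidate[] {
    return [...this.rows]
  }
}
```

There is deliberately no update or delete method; a correction is another append. In a real database the same intent would probably be expressed with a CHECK constraint on classification and by withholding UPDATE and DELETE grants on the table.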

The schema, if you want it in SQL:

CREATE TABLE memory_candidates (
  id              TEXT PRIMARY KEY,
  tenant_id       TEXT NOT NULL,
  user_id         TEXT,
  intent_id       TEXT,
  source          TEXT NOT NULL CHECK (source IN ('agent','operator','system')),
  text            TEXT NOT NULL,
  evidence_refs   JSONB NOT NULL DEFAULT '[]',
  classification  TEXT NOT NULL,
  captured_at     TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX ix_mc_tenant_intent ON memory_candidates (tenant_id, intent_id);
CREATE INDEX ix_mc_classification ON memory_candidates (classification);

Step 2 — review

The review step is where opinion lives. It dedups, detects contradictions with already-promoted memory, scores priority, and produces a typed verdict.

harness/memory/review.ts
import type { MemoryCandidate } from "./capture"
import { listPromotedFor, embeddingOf } from "@/store"
 
export type ReviewedCandidate = {
  candidate_id: string
  status: "pending_promotion" | "duplicate_of" | "contradicts" | "rejected"
  duplicate_of_id?: string         // existing PromotedMemory id
  contradicts_id?: string          // existing PromotedMemory id
  contradiction_resolution?: "supersede" | "coexist" | "block"
  proposed_tier: "working" | "episodic" | "semantic" | "durable"
  priority_score: number           // 0..1, used at recall time
  reviewer: "auto" | "human"
  reviewer_notes?: string
  reviewed_at: string
}
 
const DUP_THRESHOLD = 0.94
const CONTRADICT_THRESHOLD = 0.85
 
export async function reviewCandidate(c: MemoryCandidate): Promise<ReviewedCandidate> {
  const promoted = await listPromotedFor(c.tenant_id, c.user_id)
  const cEmb = await embeddingOf(c.text)
 
  // nearest(), asserts(), and candidateIsNewer() are helpers assumed to live next to this file
  const near = nearest(cEmb, promoted, "text")

  // 1. contradiction: semantically close, but the assertion differs.
  //    Checked before dedup, because a contradicting fact can score above
  //    DUP_THRESHOLD and would otherwise be filed as a duplicate.
  if (near && near.similarity >= CONTRADICT_THRESHOLD
           && asserts(near.text) !== asserts(c.text)) {
    // the candidate supersedes the promoted memory only when it is newer
    const resolution = candidateIsNewer(c, near) ? "supersede" : "block"
    return basic(c, {
      status: "contradicts",
      contradicts_id: near.id,
      contradiction_resolution: resolution,
      reviewer_notes: `contradicts ${near.id} (sim=${near.similarity.toFixed(3)})`,
    })
  }

  // 2. dedup: very close embedding and the same assertion
  if (near && near.similarity >= DUP_THRESHOLD) {
    return basic(c, {
      status: "duplicate_of",
      duplicate_of_id: near.id,
      reviewer_notes: `dup of ${near.id} (sim=${near.similarity.toFixed(3)})`,
    })
  }
 
  // 3. tier + priority assignment
  return basic(c, {
    status: "pending_promotion",
    proposed_tier: tierFor(c),
    priority_score: priorityFor(c),
  })
}
 
function tierFor(c: MemoryCandidate): ReviewedCandidate["proposed_tier"] {
  if (c.source === "operator") return "durable"
  if (c.intent_id?.startsWith("support.")) return "episodic"
  if (c.classification === "PII") return "working"  // tightest TTL; reached only when neither rule above matches
  return "semantic"
}
 
function priorityFor(c: MemoryCandidate): number {
  const base = c.source === "operator" ? 0.9 : c.source === "system" ? 0.7 : 0.5
  const evidenceBoost = Math.min(0.3, c.evidence_refs.length * 0.05)
  return Math.min(1, base + evidenceBoost)
}
 
function basic(c: MemoryCandidate, x: Partial<ReviewedCandidate>): ReviewedCandidate {
  return {
    candidate_id: c.id,
    status: "pending_promotion",
    proposed_tier: "semantic",
    priority_score: 0.5,
    reviewer: "auto",
    reviewed_at: new Date().toISOString(),
    ...x,
  }
}

Three things this code does that a naive version usually misses.

Duplicates do not block. They get marked duplicate_of and never promoted, but the candidate stays in the table. If the duplicate later turns out to be wrong, the original promotion can be retracted and this candidate becomes a queue entry again. Append-only at every stage.

Contradictions trigger a resolution choice, not a silent overwrite. supersede retracts the older promoted memory and promotes the new one. coexist is for cases where the older fact is still true at a different scope (e.g., per-region). block rejects the new candidate. The automatic default is conservative: block, unless the candidate is newer than the memory it contradicts. The auto-reviewer above never emits coexist; it is available only as an operator override.

Tier and priority are derived, not picked. The code maps source/classification/intent to a tier; the code maps source/evidence count to priority. Operators can override, but the default is deterministic. Recalls a month from now reproduce the same scores.
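
The review code leans on a nearest helper that is not shown. A minimal sketch of what it could look like follows, assuming each promoted row carries a precomputed embedding vector and that similarity means cosine similarity. Embedded, Match, cosine, and nearestByCosine are hypothetical names, and the signature differs from the three-argument call in the listing.

```typescript
// Hypothetical sketch of a nearest-neighbour lookup over promoted memories.
// Assumes every row already has an embedding of the same dimension as the query.
type Embedded = { id: string; text: string; embedding: number[] }
type Match = Embedded & { similarity: number }

// Cosine similarity in [-1, 1]; returns 0 when either vector is all zeros.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("embedding dimensions differ")
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB)
}

// Linear scan: returns the single closest row, or null when there are none.
function nearestByCosine(query: number[], rows: Embedded[]): Match | null {
  let best: Match | null = null
  for (const row of rows) {
    const similarity = cosine(query, row.embedding)
    if (best === null || similarity > best.similarity) {
      best = { ...row, similarity }
    }
  }
  return best
}
```

A linear scan is fine for the per-user slice that listPromotedFor returns; a larger corpus would likely want a vector index instead. The thresholds (0.94 and 0.85) only mean something relative to one embedding model, so they should be re-tuned whenever the model changes.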

Step 3 — promote

Promotion is a typed write that emits a PromotedMemory row and (optionally) retracts an older one.

harness/memory/promote.ts
import type { ReviewedCandidate } from "./review"
import type { MemoryCandidate } from "./capture"
 
export type PromotedMemory = {
  id: string                     // pm_a17
  candidate_id: string
  tenant_id: string
  user_id?: string
  intent_scope?: string          // intent_id this memory is recall-scoped to (or null = all)
  text: string
  evidence_refs: string[]
  classification: string
  tier: "working" | "episodic" | "semantic" | "durable"
  priority: number
  promoted_at: string
  expires_at?: string            // tier-driven TTL
  retracted_at?: string
  retracted_by?: string          // pm_id that superseded this one
}
 
const TTL_MS: Record<PromotedMemory["tier"], number | null> = {
  working:  60 * 60 * 1000,                    // 1 hour
  episodic: 30 * 24 * 60 * 60 * 1000,          // 30 days
  semantic: 365 * 24 * 60 * 60 * 1000,         // 1 year
  durable:  null,                              // no automatic expiry
}
 
export async function promote(c: MemoryCandidate, r: ReviewedCandidate): Promise<PromotedMemory> {
  // a contradiction resolved as "supersede" is promotable; no other non-pending verdict is
  const supersedes = r.status === "contradicts" && r.contradiction_resolution === "supersede"
  if (r.status !== "pending_promotion" && !supersedes) {
    throw new Error(`refusing to promote ${c.id}: review status ${r.status}`)
  }
  const now = Date.now()
  const ttl = TTL_MS[r.proposed_tier]
  const pm: PromotedMemory = {
    id: `pm_${shortid()}`,
    candidate_id: c.id,
    tenant_id: c.tenant_id,
    user_id: c.user_id,
    intent_scope: c.intent_id,
    text: c.text,
    evidence_refs: c.evidence_refs,
    classification: c.classification,
    tier: r.proposed_tier,
    priority: r.priority_score,
    promoted_at: new Date(now).toISOString(),
    expires_at: ttl !== null ? new Date(now + ttl).toISOString() : undefined,
  }
  await db.insert("promoted_memory", pm)
  // retract the contradicted memory only after its replacement exists
  if (supersedes && r.contradicts_id) await supersede(r.contradicts_id, pm.id)
  return pm
}
 
export async function supersede(older_pm_id: string, newer_pm_id: string): Promise<void> {
  await db.update("promoted_memory",
    { retracted_at: new Date().toISOString(), retracted_by: newer_pm_id },
    { id: older_pm_id, retracted_at: null }
  )
}

Two design properties.

Promotion is the only insert into promoted_memory. The capture table is the substrate; the promoted table is the contract. Recall queries this table only.

Retraction sets retracted_at instead of deleting. History is preserved; live recalls filter WHERE retracted_at IS NULL. This is the pattern that lets the audit trail answer “what did the agent believe at 09:31 last Tuesday?”: a replay query compares promoted_at, retracted_at, and expires_at against that timestamp instead of against now.
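
To make the audit question concrete, here is a rough sketch of an as-of filter over promoted rows. It is an in-memory illustration under assumptions, not part of the pipeline above: PromotedRow and visibleAt are hypothetical names, and only the three timestamp columns matter.

```typescript
// Hypothetical replay helper: which promoted rows were recall-visible at time `at`?
type PromotedRow = {
  id: string
  promoted_at: string
  expires_at?: string | null
  retracted_at?: string | null
}

function visibleAt(rows: PromotedRow[], at: string): PromotedRow[] {
  const t = Date.parse(at)
  return rows.filter((r) =>
    // already promoted at that moment
    Date.parse(r.promoted_at) <= t &&
    // not yet retracted at that moment (a later retraction does not hide it)
    (r.retracted_at == null || Date.parse(r.retracted_at) > t) &&
    // not yet expired at that moment
    (r.expires_at == null || Date.parse(r.expires_at) > t)
  )
}
```

The same three comparisons translate directly into a SQL WHERE clause with the timestamp passed as a parameter. The live recall query is the special case where the timestamp is now, which is why it can test retracted_at IS NULL.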

The schema:

CREATE TABLE promoted_memory (
  id              TEXT PRIMARY KEY,
  candidate_id    TEXT NOT NULL REFERENCES memory_candidates(id),
  tenant_id       TEXT NOT NULL,
  user_id         TEXT,
  intent_scope    TEXT,
  text            TEXT NOT NULL,
  evidence_refs   JSONB NOT NULL DEFAULT '[]',
  classification  TEXT NOT NULL,
  tier            TEXT NOT NULL CHECK (tier IN ('working','episodic','semantic','durable')),
  priority        NUMERIC(3,2) NOT NULL,
  promoted_at     TIMESTAMPTZ NOT NULL,
  expires_at      TIMESTAMPTZ,
  retracted_at    TIMESTAMPTZ,
  retracted_by    TEXT
);
CREATE INDEX ix_pm_recall ON promoted_memory (tenant_id, user_id, intent_scope, retracted_at, expires_at);
CREATE INDEX ix_pm_priority ON promoted_memory (priority DESC);

Step 4 — recall

The recall query is what the Context Pack Compiler stage 5 calls. It is the only path memory takes into the model:

harness/memory/recall.ts
import type { PromotedMemory } from "./promote"
 
export type RecallRequest = {
  tenant_id: string
  user_id?: string
  intent_id: string
  max_recalls: number              // from pack.memory_layer.recall_policy
  classification_allowed: string[] // from compiled_context.runtime_controls
}
 
export async function recall(req: RecallRequest): Promise<PromotedMemory[]> {
  const now = new Date().toISOString()
  // tenant + user + (intent-scoped OR cross-intent) + not retracted + not expired
  const rows = await db.query(`
    SELECT *
    FROM promoted_memory
    WHERE tenant_id = $1
      AND (user_id IS NULL OR user_id = $2::text)  -- a NULL $2 (anonymous) matches tenant-wide rows only
      AND (intent_scope IS NULL OR intent_scope = $3)
      AND retracted_at IS NULL
      AND (expires_at IS NULL OR expires_at > $4)
      AND classification = ANY($5)
    ORDER BY priority DESC, promoted_at DESC
    LIMIT $6
  `, [req.tenant_id, req.user_id ?? null, req.intent_id, now,
      req.classification_allowed, req.max_recalls])
 
  return rows
}

Five filters do all the work.

tenant_id — strict per-tenant isolation at the storage layer, not at the API. Cross-tenant recall is structurally impossible.

user_id — recall is user-scoped when a user is identified. Anonymous recalls fall back to tenant-only memories.

intent_scope — memories scoped to a specific intent only surface for that intent. Cross-intent memories surface for everything.

retracted_at IS NULL and expires_at > now — retracted and TTL-expired memories are invisible. Tier-driven TTL is enforced at recall time, with garbage collection a background job.

classification = ANY($5) — the compiler passes in the allowed classifications from the active runtime controls. A run with no PII clearance never sees PII memories, even ones scoped to the user.
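
The filter semantics described above can also be written as a single predicate, which is convenient for unit-testing the rules without a database. This is a sketch under assumptions: MemoryRow, RecallReq, and isRecallable are hypothetical names, and NULL columns are modelled as null.

```typescript
// Hypothetical in-memory mirror of the recall filters, for testing the rules.
type MemoryRow = {
  id: string
  tenant_id: string
  user_id: string | null        // null = tenant-wide memory
  intent_scope: string | null   // null = cross-intent memory
  classification: string
  expires_at: string | null
  retracted_at: string | null
}

type RecallReq = {
  tenant_id: string
  user_id?: string
  intent_id: string
  classification_allowed: string[]
}

function isRecallable(m: MemoryRow, req: RecallReq, now: Date): boolean {
  // 1. strict tenant isolation
  if (m.tenant_id !== req.tenant_id) return false
  // 2. user-scoped rows need a matching user; anonymous requests see tenant-wide rows only
  if (m.user_id !== null && m.user_id !== req.user_id) return false
  // 3. intent-scoped rows surface only for their own intent
  if (m.intent_scope !== null && m.intent_scope !== req.intent_id) return false
  // 4. retracted or expired rows are invisible
  if (m.retracted_at !== null) return false
  if (m.expires_at !== null && Date.parse(m.expires_at) <= now.getTime()) return false
  // 5. the data class must be cleared for this run
  return req.classification_allowed.some((cls) => cls === m.classification)
}
```

Keeping a predicate like this next to the SQL gives a cheap way to check that the two agree on edge cases such as anonymous requests and missing PII clearance.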

Operator review for high-tier memories

Auto-promotion is fine for episodic and semantic tiers. Durable memories — the ones with no TTL — should pass through a human review queue:

harness/memory/operator-queue.ts
export type OperatorReviewItem = {
  candidate_id: string
  proposed_tier: "durable"
  proposed_text: string
  evidence_refs: string[]
  reason: "operator_source" | "user_correction" | "compliance_flag"
  enqueued_at: string
}
 
// in review.ts, after tier assignment:
const proposed_tier = tierFor(c)
if (proposed_tier === "durable") {
  // do not promote yet; enqueue for operator review
  await enqueueOperatorReview({
    candidate_id: c.id,
    proposed_tier: "durable",
    proposed_text: c.text,
    evidence_refs: c.evidence_refs,
    reason: "operator_source",
    enqueued_at: new Date().toISOString(),
  })
  return basic(c, {
    status: "pending_promotion",
    proposed_tier: "durable",
    priority_score: priorityFor(c),  // keep the derived score; basic() would default it to 0.5
    reviewer: "human",
  })
}

A durable memory with no TTL is the highest-leverage and highest-risk artifact in the memory system. Treating it like a code change — review queue, named approver, recorded verdict — is the discipline that prevents poisoned memory the way the Improvement Loop prevents poisoned policy.
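
A recorded verdict with a named approver could look roughly like the sketch below. Everything in it is an assumption for illustration: DurableApproval, recordVerdict, and mayPromote are hypothetical names, and the rule that the approver must differ from the proposer is one reasonable reading of "named approver", not something the pipeline above enforces.

```typescript
// Hypothetical approval record for durable candidates.
type Verdict = "approve" | "reject"

type DurableApproval = {
  candidate_id: string
  proposed_by: string   // operator whose action produced the candidate
  approved_by: string   // named approver; must be a different person
  verdict: Verdict
  decided_at: string
}

// Builds the audit record, refusing anonymous approvals and self-approvals.
function recordVerdict(
  candidate_id: string,
  proposed_by: string,
  approved_by: string,
  verdict: Verdict,
): DurableApproval {
  if (approved_by.trim() === "") throw new Error("approver must be named")
  if (approved_by === proposed_by) throw new Error("approver must differ from proposer")
  return { candidate_id, proposed_by, approved_by, verdict, decided_at: new Date().toISOString() }
}

// promote() for a durable candidate would be called only when this returns true.
function mayPromote(a: DurableApproval): boolean {
  return a.verdict === "approve"
}
```

Storing the approval as its own row, rather than as a flag on the candidate, keeps capture append-only and gives replay a timestamped answer to who approved a durable memory.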

A worked correction

Walking one operator correction end-to-end:

// after the operator overrides a refund decision and types
// "this customer has a verbal NDA on PAN exposure"
const mc = await captureMemory({
  tenant_id: "tenant_acme_prod",
  user_id: "cust_8861",
  intent_id: "support.refund.execute",
  source: "operator",
  text: "Customer cust_8861 has a verbal NDA limiting PAN exposure to last-4 only.",
  evidence_refs: ["operator:override:fb_2026_05_09_x9"],
  classification: "PII",
})
// → MemoryCandidate mc_..._x9 lands in capture
 
const reviewed = await reviewCandidate(mc)
// → status: "pending_promotion", tier: "durable", reviewer: "human"
// → enqueued for operator review
 
// operator (different person) approves it
const pm = await promote(mc, reviewed)
// → PromotedMemory pm_y3 lands in promoted_memory, tier=durable, priority=0.95
 
// next refund recall for this user surfaces it:
await recall({
  tenant_id: "tenant_acme_prod",
  user_id: "cust_8861",
  intent_id: "support.refund.execute",
  max_recalls: 8,
  classification_allowed: ["PII", "INTERNAL", "PUBLIC"],
})
// → returns [pm_y3, ...] — the agent now redacts PAN to last-4 from the start

What happened end-to-end: one operator correction became a typed, classified, tier-bound, durable memory. Every future refund for this customer compiles with that fact in stage 5 of the Context Pack Compiler. The agent does not need to re-discover the NDA on every conversation; it inherits it.

What this changes

Three things on day one of running this pipeline.

Stale recalls stop happening. TTL-expired memories drop out at recall time. Retracted memories drop out at recall time. Contradictions are explicit verdicts, not silent overwrites. The recall the compiler reads is, by construction, current and uncontradictory.

Memory becomes auditable. Every promoted memory points back to its candidate, its review verdict, and its evidence refs. When the auditor asks “where did this fact come from?”, the trail walks back to a typed candidate write and (often) a typed operator approval.

Privacy guarantees become structural. Classification at capture, classification filter at recall. PII memories never reach a run that does not carry PII clearance. Cross-tenant recall is impossible at the SQL layer, not at the API layer.

Memory readiness checklist

Layer          Ready when
Capture        Candidates are append-only, classified, evidence-bound, and tenant-scoped.
Review         Duplicate, contradiction, tier, priority, and reviewer verdicts are recorded.
Promotion      Promoted rows carry candidate id, tier, TTL, priority, evidence refs, and retraction fields.
Recall         Queries filter by tenant, subject, intent, classification, expiry, and retraction.
Human review   Durable and sensitive memories require named approval before promotion.
Replay         A past run can explain why a memory was visible, hidden, expired, or retracted.

Five files, two tables, one recall query. The capture is unbounded; the promoted slice is curated; the compiler reads only the curated slice. That is the whole shape, and it is what stops “we set memory to remember everything” from becoming “we are now telling customers stale facts on every conversation.”

Wire the four stages this sprint. Operator-source memories are the easiest first promotion class — high signal, low volume. By the second cycle, the agent has accumulated enough durable memory to noticeably reduce the rate of “I already told you that” turns. That signal is the whole point.
