Skip to content
Press / to search

Knowledge Substrate

Intelligence-plane component implementing the Knowledge Graph + GraphRAG retrieval surface.

Reference DesignLast reviewed: Edit on GitHub
At a glance
Intelligence planeSubstrate of meaning

GraphRAG retrieval over a snapshot-pinned Knowledge Graph — every hop carries provenance; conflicts surface, never silenced.

Inputs
  • Ontology + Identity Layer references from the active Context Pack
  • Retrieval parameters (mode, seed_strategy, max_hops, top_k, freshness_window, evidence_required)
  • Run Context (tenant_id, role, classification scope)
Outputs
  • EvidenceBundle (snapshot_version, ontology_version, seed, hops[], conflicts[], truncated)
  • Aggregate confidence and per-hop confidence
Canonical types
  • EvidenceBundle
  • KGSnapshot
  • Hop
  • ConflictMarker

Reference Architecture

The Knowledge Substrate is the runtime component that backs the Knowledge Graph spec. It serves typed evidence bundles to the Compiler under hop budgets, snapshot pinning, and tenant scoping.

Definition

A read-side service for the LPG (typed labeled-property graph). Owns vector / BM25 / graph indices kept in sync against a single snapshot_version; serves GraphRAG retrieval; enforces tenant isolation at the storage layer.

Why it exists

Naive RAG returns opaque chunks. Decisions need typed evidence the Critic can validate and the audit can reproduce. This component is the substrate that produces evidence bundles (hops[], evidence_refs, conflicts, snapshot pin) instead.

Inputs

  • Ontology + Identity Layer references from the active Context Pack
  • Retrieval parameters from the pack (mode, seed_strategy, max_hops, top_k, freshness_window, evidence_required)
  • Run Context (tenant_id, role, classification scope)

Outputs

  • EvidenceBundle:
    • snapshot_version, ontology_version
    • seed, hops[] with evidence_ref per hop
    • conflicts[] (subject + predicate + values)
    • truncated flag (always explicit; never silent)
  • Aggregate confidence and per-hop confidence

How it works

  1. Resolves snapshot_pin_rule → fixed snapshot_version for replay.
  2. Seeds expansion from the intent + extracted entities.
  3. Walks permitted relationships under max_hops and top_k constraints.
  4. Filters by tenant_id, role, and data_classification from the Run Context.
  5. Surfaces conflicting facts as conflict: true markers — does not silently resolve.
  6. Returns the typed bundle with truncated set explicitly if the budget capped expansion.

Failure modes

  • Edge missing evidence_ref — write rejected at the boundary.
  • Cross-tenant traversal — denied at the storage layer; emitted as a security event.
  • Snapshot stale at read time after writes have advanced — mitigated by snapshot version reconciliation on every read.
  • Hop budget truncates a relevant hop — mitigated by truncated: true flag the planner must handle.
  • Vector index drift vs. structured edges — alerting on disagreement; reindex on drift.

Operational concerns

  • Per-environment snapshot pinning; promotion deliberate.
  • Reindex cost when ontology version advances.
  • Eviction of edges past valid_to requires audit trail.
  • Backfill jobs replay against ontology versions, not now.

Evaluation metrics

  • Evidence coverage (fraction of decisions whose evidence_refs resolve).
  • Retrieval precision/recall on golden sets per intent.
  • Conflict rate by (entity_type, relationship_type).
  • Cross-tenant denial rate (target: zero in steady state).
  • Replay determinism per snapshot pin.