Knowledge Substrate
Intelligence-plane component implementing the Knowledge Graph + GraphRAG retrieval surface.
GraphRAG retrieval over a snapshot-pinned Knowledge Graph — every hop carries provenance; conflicts surface, never silenced.
- EvidenceBundle
- KGSnapshot
- Hop
- ConflictMarker
The Knowledge Substrate is the runtime component that backs the Knowledge Graph spec. It serves typed evidence bundles to the Compiler under hop budgets, snapshot pinning, and tenant scoping.
Definition
A read-side service for the LPG (typed labeled-property graph). Owns vector / BM25 / graph indices kept in sync against a single snapshot_version; serves GraphRAG retrieval; enforces tenant isolation at the storage layer.
Why it exists
Naive RAG returns opaque chunks, but decisions need typed evidence that the Critic can validate and that an audit can reproduce. This component instead produces evidence bundles (hops[], evidence_refs, conflicts, snapshot pin).
Inputs
- Ontology + Identity Layer references from the active Context Pack
- Retrieval parameters from the pack (mode, seed_strategy, max_hops, top_k, freshness_window, evidence_required)
- Run Context (tenant_id, role, classification scope)
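As a rough illustration, the inputs listed above could be carried as two small immutable records. This is a minimal sketch, not the component's actual schema: the field names come from the list, but the types, the unit of freshness_window (seconds), and the validation rules are assumptions.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class RetrievalParams:
    """Retrieval parameters taken from the active Context Pack (types assumed)."""
    mode: str
    seed_strategy: str
    max_hops: int
    top_k: int
    freshness_window: Optional[int]  # assumed to be seconds; None = no limit
    evidence_required: bool

    def __post_init__(self) -> None:
        # Assumed sanity checks; the source does not state the valid ranges.
        if self.max_hops < 0:
            raise ValueError("max_hops must be >= 0")
        if self.top_k < 1:
            raise ValueError("top_k must be >= 1")


@dataclass(frozen=True)
class RunContext:
    """Caller identity and scope used to filter every read."""
    tenant_id: str
    role: str
    classification_scope: FrozenSet[str]
```

Freezing both records keeps a retrieval request from being mutated mid-walk, which fits the replay goal, though the real component may enforce that differently.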
Outputs
- EvidenceBundle:
  - snapshot_version, ontology_version
  - seed, hops[] with evidence_ref per hop
  - conflicts[] (subject + predicate + values)
  - truncated flag (always explicit; never silent)
- Aggregate confidence and per-hop confidence
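The output shape above might be modelled along the following lines. The field names follow the list; everything else is an assumption made for illustration, in particular the Hop fields other than evidence_ref, and the choice to aggregate confidence as the minimum per-hop value (the source does not say how the aggregate is computed).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Hop:
    source: str          # hypothetical field
    relationship: str    # hypothetical field
    target: str          # hypothetical field
    evidence_ref: str    # required on every hop
    confidence: float    # per-hop confidence


@dataclass(frozen=True)
class ConflictMarker:
    subject: str
    predicate: str
    values: Tuple[str, ...]


@dataclass(frozen=True)
class EvidenceBundle:
    snapshot_version: str
    ontology_version: str
    seed: Tuple[str, ...]
    hops: List[Hop]
    conflicts: List[ConflictMarker]
    truncated: bool  # deliberately no default: callers must set it explicitly

    def aggregate_confidence(self) -> float:
        # Assumption: weakest-link aggregation; 0.0 when there are no hops.
        return min((h.confidence for h in self.hops), default=0.0)
```

Leaving truncated without a default is one way to make the flag "always explicit": constructing a bundle without it raises a TypeError.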
How it works
- Resolves snapshot_pin_rule → fixed snapshot_version for replay.
- Seeds expansion from the intent + extracted entities.
- Walks permitted relationships under max_hops and top_k constraints.
- Filters by tenant_id, role, and data_classification from the Run Context.
- Surfaces conflicting facts as conflict: true markers; does not silently resolve them.
- Returns the typed bundle with truncated set explicitly if the budget capped expansion.
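The steps above can be sketched as a bounded breadth-first walk over an in-memory edge list. This is an illustration under stated assumptions, not the component's algorithm: the Edge fields, the retrieve signature, applying top_k per hop level, reducing role to a set of readable classifications (scope), and detecting conflicts only on relationships declared single_valued are all hypothetical choices. Snapshot pin resolution is taken as already done, so snapshot_version is simply echoed back.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:  # hypothetical edge record
    src: str
    rel: str
    dst: str
    tenant_id: str
    data_classification: str
    evidence_ref: str
    confidence: float


def retrieve(edges, seeds, *, snapshot_version, tenant_id, scope,
             permitted, single_valued, max_hops, top_k):
    # Filter first so out-of-scope edges can never enter the walk.
    visible = [e for e in edges
               if e.tenant_id == tenant_id
               and e.data_classification in scope
               and e.rel in permitted]

    def outgoing(frontier, taken):
        found = [e for e in visible if e.src in frontier and e not in taken]
        # Deterministic ordering keeps replays of one snapshot stable.
        return sorted(found, key=lambda e: (-e.confidence, e.src, e.rel, e.dst))

    frontier, seen = set(seeds), set(seeds)
    hops, taken, truncated = [], set(), False
    for _ in range(max_hops):
        candidates = outgoing(frontier, taken)
        if len(candidates) > top_k:
            truncated = True  # top_k dropped at least one candidate
        chosen = candidates[:top_k]
        hops.extend(chosen)
        taken.update(chosen)
        frontier = {e.dst for e in chosen} - seen
        seen |= frontier
        if not frontier:
            break  # graph exhausted before the hop budget
    else:
        # Hop budget used up: flag truncation if more edges were reachable.
        if outgoing(frontier, taken):
            truncated = True

    # Surface conflicts instead of picking a winner.
    values = {}
    for e in hops:
        if e.rel in single_valued:
            values.setdefault((e.src, e.rel), set()).add(e.dst)
    conflicts = [{"subject": s, "predicate": p, "values": sorted(v), "conflict": True}
                 for (s, p), v in sorted(values.items()) if len(v) > 1]

    return {"snapshot_version": snapshot_version, "seed": sorted(seeds),
            "hops": hops, "conflicts": conflicts, "truncated": truncated}
```

In this sketch the truncated flag is set in two places, once when top_k drops candidates and once when max_hops stops the walk with edges still reachable, so a caller never has to infer truncation from the bundle's size.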
Failure modes
- Edge missing evidence_ref: write rejected at the boundary.
- Cross-tenant traversal: denied at the storage layer; emitted as a security event.
- Snapshot stale at read time after writes have advanced: mitigated by snapshot version reconciliation on every read.
- Hop budget truncates a relevant hop: mitigated by the truncated: true flag, which the planner must handle.
- Vector index drift vs. structured edges: alerting on disagreement; reindex on drift.
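Two of the failure modes above, the missing evidence_ref and the cross-tenant traversal, lend themselves to a small guard sketch. The exception names, the dict-based edge shape, and the in-memory security_events list are hypothetical; the real component enforces these at its write boundary and storage layer by means the source does not describe.

```python
class EdgeRejected(ValueError):
    """Raised when a write arrives without provenance (hypothetical name)."""


class CrossTenantDenied(PermissionError):
    """Raised when a read crosses a tenant boundary (hypothetical name)."""


def validate_edge_write(edge: dict) -> dict:
    # Reject at the boundary: an edge without evidence_ref is never stored.
    if not edge.get("evidence_ref"):
        raise EdgeRejected("edge is missing evidence_ref")
    return edge


def read_edge(edge: dict, run_tenant_id: str, security_events: list) -> dict:
    # Deny and record, rather than silently filtering the edge out.
    if edge["tenant_id"] != run_tenant_id:
        security_events.append({
            "type": "cross_tenant_traversal",
            "requested_by": run_tenant_id,
            "edge_tenant": edge["tenant_id"],
        })
        raise CrossTenantDenied("cross-tenant traversal denied")
    return edge
```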
Operational concerns
- Per-environment snapshot pinning; promotion deliberate.
- Reindex cost when ontology version advances.
- Eviction of edges past valid_to requires audit trail.
- Backfill jobs replay against ontology versions, not against "now".
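The eviction concern above could look roughly like this. It is a sketch under assumptions: edges are plain dicts with hypothetical id and valid_to keys, the audit trail is an in-memory list, and the cutoff is passed in explicitly as as_of rather than read from the wall clock, so the same run can be replayed later.

```python
from datetime import datetime, timezone


def evict_expired(edges, as_of, audit_log):
    """Return edges still valid at as_of; log every evicted edge."""
    kept = []
    for edge in edges:
        valid_to = edge.get("valid_to")  # None is assumed to mean open-ended
        if valid_to is not None and valid_to < as_of:
            # Record the eviction before dropping the edge.
            audit_log.append({
                "action": "evict",
                "edge_id": edge["id"],
                "valid_to": valid_to.isoformat(),
                "as_of": as_of.isoformat(),
            })
        else:
            kept.append(edge)
    return kept
```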
Evaluation metrics
- Evidence coverage (fraction of decisions whose evidence_refs resolve).
- Retrieval precision/recall on golden sets per intent.
- Conflict rate by (entity_type, relationship_type).
- Cross-tenant denial rate (target: zero in steady state).
- Replay determinism per snapshot pin.
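The first two metrics can be computed with a few lines of set arithmetic. The function names and input shapes below are hypothetical, and the sketch assumes that a decision counts as covered only when it cites at least one evidence_ref and every cited ref resolves; the source does not define that threshold.

```python
def evidence_coverage(decisions, resolvable):
    """Fraction of decisions whose evidence_refs all resolve.

    decisions: list of evidence_ref lists, one per decision.
    resolvable: set of refs that currently resolve.
    """
    if not decisions:
        return 0.0
    covered = sum(1 for refs in decisions
                  if refs and all(r in resolvable for r in refs))
    return covered / len(decisions)


def precision_recall(retrieved, golden):
    """Set-based precision and recall against one golden set."""
    retrieved, golden = set(retrieved), set(golden)
    hits = len(retrieved & golden)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(golden) if golden else 0.0
    return precision, recall
```

Running precision_recall once per intent and averaging would give the per-intent view the list describes.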