
Quickstart

Build the smallest useful ContextOS MVP: one workflow, one Context Pack, one governed tool, one typed DecisionRecord, and one replay path.

At a glance

This quickstart is for building your first ContextOS-governed workflow, not for running this documentation repository.

In one pass you will model a small production agent harness for a support.refund workflow. The goal is not to build a complete platform. The goal is to create the smallest set of ContextOS constructs that makes one agent run explainable, governed, and replayable.

MVP mental model

Do not start with an autonomous agent. Start with one workflow and wrap it in a harness.

user request
  -> RunContext
  -> Context Pack
  -> CompiledContext
  -> Plan + Critic checks
  -> ToolEnvelope through the Tool Gateway
  -> DecisionRecord
  -> Replay packet + improvement proposal

The model is only one participant. ContextOS is the execution environment around it.
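
The flow above can be read as an ordered list of artifacts that a run must produce. The sketch below is only an illustration of that ordering: the snake_case stage keys and the dict-of-artifacts shape are hypothetical, not a ContextOS API.

```python
from typing import Optional

# Stage names mirror the flow above. The keys and the dict-of-artifacts
# shape are assumptions made for this sketch.
HARNESS_STAGES = [
    "run_context",
    "context_pack",
    "compiled_context",
    "plan",
    "critic_verdict",
    "tool_envelope",
    "decision_record",
    "replay_packet",
]


def first_missing_stage(artifacts: dict) -> Optional[str]:
    """Return the earliest stage with no artifact, or None when all exist."""
    for stage in HARNESS_STAGES:
        if artifacts.get(stage) is None:
            return stage
    return None


def skipped_stages(artifacts: dict) -> list:
    """Return stages that are missing although a later stage has an artifact."""
    present = [i for i, s in enumerate(HARNESS_STAGES) if artifacts.get(s) is not None]
    if not present:
        return []
    return [s for s in HARNESS_STAGES[: max(present)] if artifacts.get(s) is None]


# A run that reached a tool call without a compiled context or a Critic verdict.
run = {"run_context": {}, "context_pack": {}, "plan": {}, "tool_envelope": {}}
print(first_missing_stage(run))
print(skipped_stages(run))
```

A harness that refuses to continue while `skipped_stages` is non-empty enforces the point of this section: the model never acts outside the surrounding execution environment.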

Pick one workflow

Use a workflow that is valuable, narrow, and easy to audit. For this quickstart:

| Choice | MVP value |
| --- | --- |
| Workflow | support.refund |
| User request | “Refund order ord_881 for INR 4200.” |
| Business risk | money movement, customer impact, fraud exposure |
| First tool | adp_orders.lookup as read_only |
| Risky tool | adp_payments.issue_refund as destructive |
| Final artifact | DecisionRecord for support.refund.execute |

Your first workflow should have one primary intent, one or two entities, one safe read tool, one governed write tool, and one explicit decision.

Step 1: Name the business entities

The Intelligence plane starts by making the workflow’s nouns stable. For refund support, the minimum ontology is Customer, Order, and a relationship between them.

{
  "ontology": {
    "namespace": "ont.support",
    "version": "1.0.0",
    "entity_types": ["Customer", "Order"],
    "relationship_types": ["order_belongs_to_customer"]
  },
  "identity_layer": {
    "ceid_namespaces": ["customer", "order"],
    "examples": {
      "customer": "customer:cus_77",
      "order": "order:ord_881"
    }
  }
}

Minimum bar:

| Check | You are done when |
| --- | --- |
| Entity identity | Every important object has a stable canonical ID. |
| Relationship meaning | The runtime can explain how the order belongs to the customer. |
| Evidence source | The order lookup can produce an evidence ref, not just a value. |
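
One way to keep canonical IDs stable is to parse them at the boundary and reject anything outside the declared namespaces. The sketch below assumes the `namespace:local_id` shape seen in the examples (`customer:cus_77`, `order:ord_881`); the `CanonicalId` class and `parse_ceid` function are hypothetical names, not part of ContextOS.

```python
from dataclasses import dataclass

# Taken from identity_layer.ceid_namespaces in the example above.
ALLOWED_NAMESPACES = {"customer", "order"}


@dataclass(frozen=True)
class CanonicalId:
    namespace: str
    local_id: str

    def __str__(self) -> str:
        return f"{self.namespace}:{self.local_id}"


def parse_ceid(raw: str) -> CanonicalId:
    """Parse 'namespace:local_id' and reject unknown or malformed IDs."""
    namespace, sep, local_id = raw.partition(":")
    if not sep or not namespace or not local_id:
        raise ValueError(f"not a canonical ID: {raw!r}")
    if namespace not in ALLOWED_NAMESPACES:
        raise ValueError(f"unknown namespace: {namespace!r}")
    return CanonicalId(namespace, local_id)


order = parse_ceid("order:ord_881")
print(order.namespace, order.local_id)
```

A bare `ord_881` fails here on purpose: an ID without a namespace cannot be tied back to an ontology entity type.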

Read next: Ontology, Identity Layer, Knowledge Graph.

Step 2: Define the RunContext

RunContext is the cross-cutting identity and limit envelope. It travels through every plane.

{
  "run_id": "run_refund_001",
  "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
  "session_id": "sess_support_42",
  "tenant_id": "tenant_acme_prod",
  "user": {
    "user_id": "usr_771",
    "role": "support_agent",
    "delegation": {
      "auth_type": "oauth2",
      "subject": "u:user:771",
      "scopes": ["orders.read", "payments.refund"],
      "token_ref": "tok_user_7a"
    }
  },
  "agent": {
    "agent_id": "agt_support",
    "role": "support_ops",
    "workload_identity": "spiffe://contextos/agents/support",
    "token_ref": "tok_agent_92"
  },
  "intent": "support.refund",
  "locale": "en-IN",
  "safety_mode": "destructive",
  "run_budget": {
    "total_tokens": 12000,
    "bucket_tokens": {
      "business": 1200,
      "policy": 1800,
      "tool": 1200,
      "evidence": 3500,
      "memory": 1300,
      "session": 3000
    },
    "max_tool_calls": 8,
    "max_replan_attempts": 2,
    "wall_clock_ms": 30000,
    "max_cost_cents": 25
  }
}

Minimum bar:

| Field | Why it exists |
| --- | --- |
| run_id | Lets every artifact belong to one execution. |
| trace_id | Propagates through model calls, policy checks, and tools. |
| tenant_id | Prevents cross-tenant policy, data, and tool leakage. |
| user.delegation | Explains which human authority the agent may use. |
| agent.workload_identity | Gives the agent its own service identity. |
| safety_mode | Sets the highest action risk this run may produce. |
| run_budget | Makes loops, cost, latency, and context use bounded. |
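
A budget only bounds a run if it is internally consistent. In the example above the six buckets add up to exactly 12000, the value of `total_tokens`. The following sketch checks that invariant before a run starts; `budget_problems` is a hypothetical helper, and treating every limit as mandatory is an assumption of this sketch.

```python
def budget_problems(run_budget: dict) -> list:
    """Return human-readable problems; an empty list means the budget is usable."""
    problems = []
    total = run_budget["total_tokens"]
    allocated = sum(run_budget["bucket_tokens"].values())
    if allocated > total:
        problems.append(f"buckets allocate {allocated} tokens but total_tokens is {total}")
    # Assumption: each of these limits must be present and non-negative.
    for field in ("max_tool_calls", "max_replan_attempts", "wall_clock_ms", "max_cost_cents"):
        value = run_budget.get(field)
        if value is None or value < 0:
            problems.append(f"{field} must be set to a non-negative number")
    return problems


run_budget = {
    "total_tokens": 12000,
    "bucket_tokens": {
        "business": 1200, "policy": 1800, "tool": 1200,
        "evidence": 3500, "memory": 1300, "session": 3000,
    },
    "max_tool_calls": 8,
    "max_replan_attempts": 2,
    "wall_clock_ms": 30000,
    "max_cost_cents": 25,
}
print(budget_problems(run_budget))
```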

Read next: API Contracts, Governance.

Step 3: Write the minimal Context Pack

The Context Pack is the versioned input contract. It tells the runtime what the workflow may know, which policies apply, which tools exist, which decision must be produced, and how memory can be used.

{
  "contract_meta": {
    "contract_name": "ctxpack.support",
    "contract_version": "1.0.0",
    "issuer": "tenant_acme_prod",
    "created_at": "2026-05-09T00:00:00Z",
    "compatibility": {
      "requires": { "runtime": ">=1.0.0", "ontology": ">=1.0.0" }
    }
  },
  "pack_meta": {
    "pack_id": "ctxpack.support",
    "pack_version": "1.0.0",
    "tenant": { "tenant_id": "tenant_acme_prod", "name": "Acme Support" },
    "environment_defaults": {
      "language": "en",
      "timezone": "Asia/Kolkata",
      "currency": "INR",
      "region": "ap-south-1"
    },
    "ttl_seconds": 86400,
    "data_classification": "INTERNAL"
  },
  "intelligence_refs": {
    "ontology": {
      "namespace": "ont.support",
      "version": "1.0.0",
      "entity_types": ["Customer", "Order"],
      "relationship_types": ["order_belongs_to_customer"]
    },
    "knowledge_graph": { "snapshot_pin_rule": "alias:prod" },
    "identity_layer": { "ceid_namespaces": ["customer", "order"] }
  },
  "business_context": {
    "summary": {
      "what_we_do": "Post-purchase customer support",
      "who_we_serve": ["customers"],
      "differentiators": ["fast, policy-compliant resolution"]
    },
    "non_negotiables": ["never promise a refund before policy verification"]
  },
  "policy_layer": {
    "policy_bundles": [
      {
        "bundle_id": "POLICY_RETURNS_V4",
        "priority": 10,
        "policy_dsl": {
          "language": "jsonlogic",
          "rules": [
            {
              "rule_id": "R_REFUND_REQUIRES_IDV",
              "applies_to": { "intent": "support.refund" },
              "if": { "==": [{ "var": "request.context.identity_verified" }, true] },
              "then": { "allow": true, "requires": ["order_lookup"] },
              "rationale": "Refunds require verified customer identity."
            },
            {
              "rule_id": "R_HIGH_VALUE_REQUIRES_APPROVAL",
              "applies_to": { "intent": "support.refund" },
              "if": {
                "and": [
                  { "==": [{ "var": "user.role" }, "support_agent"] },
                  { ">": [{ "var": "request.context.refund_amount" }, 3000] }
                ]
              },
              "then": {
                "allow": true,
                "approval_mode": "destructive",
                "requires_approval_gate": "GATE_FINANCE_APPROVAL"
              },
              "decision_binding": "support.refund.execute",
              "rationale": "High-value refunds require finance approval."
            }
          ]
        }
      }
    ],
    "guardrails": {
      "must_refuse": ["refund_without_identity"],
      "must_escalate": ["fraud_signal_high"],
      "redaction_rules": ["pan", "credit_card"]
    },
    "approval_gates": [
      {
        "gate_id": "GATE_FINANCE_APPROVAL",
        "when": { ">": [{ "var": "request.context.refund_amount" }, 3000] },
        "required_approver_role": "finance_lead",
        "ttl_seconds": 7200
      }
    ]
  },
  "tooling_layer": {
    "adapter_registry": [
      {
        "adapter_id": "adp_orders",
        "type": "OPENAPI",
        "endpoint_ref": "internal://orders",
        "capabilities": ["lookup"],
        "approval_mode": "read_only"
      },
      {
        "adapter_id": "adp_payments",
        "type": "OPENAPI",
        "endpoint_ref": "internal://payments",
        "capabilities": ["issue_refund"],
        "approval_mode": "destructive"
      }
    ],
    "permissions": [
      { "permission_id": "p_orders_lookup", "adapter_id": "adp_orders", "capability": "lookup", "allow": true },
      {
        "permission_id": "p_issue_refund",
        "adapter_id": "adp_payments",
        "capability": "issue_refund",
        "allow": true,
        "requires_approval_gate": "GATE_FINANCE_APPROVAL",
        "arg_constraints": {
          "amount_inr": { "min": 1, "max": 50000 },
          "idempotency_key": { "required": true }
        }
      }
    ]
  },
  "decision_layer": {
    "decision_specs": [
      {
        "decision_key": "support.refund.execute",
        "version": "1.0.0",
        "owner_role": "support_ops",
        "required_evidence": ["identity_verified", "order_lookup", "policy_eval"],
        "allowed_outcomes": ["approved", "denied", "escalated"],
        "approval_mode": "destructive",
        "decision_right": "execute"
      }
    ]
  },
  "memory_layer": {
    "memory_policy": {
      "tier_ttls": { "working": "1h", "episodic": "24h", "semantic": "365d", "durable": "1825d" },
      "write_classes_allowed": ["preference", "decision_outcome", "correction"],
      "consent_gating": { "pii_write_back_allowed": false }
    },
    "promotion_thresholds": { "auto_promote_confidence": 0.95 }
  },
  "evaluation_layer": {
    "eval_targets": [
      { "intent": "support.refund", "policy": 1.0, "utility": 0.9, "latency_p99_ms": 3000, "safety": 1.0, "economics_cents_per_decision": 1.5 }
    ]
  },
  "tone_and_comms": {
    "voice_attributes": ["clear", "neutral"],
    "do": ["cite policy", "explain approval status"],
    "dont": ["promise outcomes before approval"]
  }
}

Minimum bar:

| Layer | MVP requirement |
| --- | --- |
| intelligence_refs | Pinned ontology, graph snapshot rule, and identity namespaces. |
| policy_layer | At least one allow rule, one guardrail, and one approval gate for risky work. |
| tooling_layer | Safe read tool plus governed write tool with approval mode. |
| decision_layer | One DecisionSpec with required evidence and allowed outcomes. |
| memory_layer | Write classes and consent behavior, even if memory writes are disabled at first. |
| evaluation_layer | Minimum score targets for policy, safety, utility, latency, and cost. |
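
To see how the two rules in POLICY_RETURNS_V4 apply to the example request, it helps to evaluate their `if` clauses by hand. The sketch below is a deliberately tiny evaluator for only the operators the pack uses (`var`, `==`, `>`, `and`). It is not a full JsonLogic implementation: it uses strict Python equality and returns booleans from `and`. The shape of `data` is an assumption based on the variable paths in the rules.

```python
def get_var(data, path: str):
    """Resolve a dotted path such as 'request.context.refund_amount'."""
    for key in path.split("."):
        if not isinstance(data, dict) or key not in data:
            return None
        data = data[key]
    return data


def evaluate(expr, data):
    """Evaluate the small rule subset used in the pack above."""
    if not isinstance(expr, dict):
        return expr  # literal value
    op, args = next(iter(expr.items()))
    if op == "var":
        return get_var(data, args)
    values = [evaluate(arg, data) for arg in args]
    if op == "==":
        return values[0] == values[1]
    if op == ">":
        return None not in values and values[0] > values[1]
    if op == "and":
        return all(values)
    raise ValueError(f"unsupported operator: {op}")


IDV_RULE = {"==": [{"var": "request.context.identity_verified"}, True]}
HIGH_VALUE_RULE = {
    "and": [
        {"==": [{"var": "user.role"}, "support_agent"]},
        {">": [{"var": "request.context.refund_amount"}, 3000]},
    ]
}

data = {
    "user": {"role": "support_agent"},
    "request": {"context": {"identity_verified": True, "refund_amount": 4200}},
}
print(evaluate(IDV_RULE, data), evaluate(HIGH_VALUE_RULE, data))
```

For the INR 4200 request both conditions hold, so the refund is allowed but must pass GATE_FINANCE_APPROVAL. At INR 2500 the second rule would not fire.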

Read next: Context Pack.

Step 4: Compile, do not prompt-stuff

The Context plane compiles the pack and request into a CompiledContext. That compiled envelope is what the model receives.

Input request:

{
  "request_id": "req_refund_001",
  "session_id": "sess_support_42",
  "tenant_id": "tenant_acme_prod",
  "context_pack_refs": ["ctxpack.support@1.0.0"],
  "input": {
    "intent": "support.refund",
    "message": "Refund order ord_881 for INR 4200.",
    "channel": "app_chat",
    "locale": "en-IN",
    "context": {
      "identity_verified": true,
      "refund_amount": 4200,
      "order_id": "ord_881"
    }
  },
  "mode": "stream",
  "runtime": {
    "plan_timeout_ms": 15000,
    "max_tool_calls": 8,
    "max_payload_bytes": 1048576,
    "max_cost_cents": 25
  },
  "trace": {
    "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
    "span_id": "00f067aa0ba902b7",
    "trace_flags": "01"
  }
}

Compiled output shape:

{
  "compiled_prompt": {
    "system": "You are an agent governed by ContextOS...",
    "developer": "Honor the policy_manifest, tool_manifest, runtime_controls, and required_evidence at every step.",
    "task": "Handle refund request req_refund_001.",
    "context_blocks": [
      { "block_id": "biz_summary", "bucket": "business", "priority": 90 },
      { "block_id": "pol_0", "bucket": "policy", "priority": 80 },
      { "block_id": "tool_0", "bucket": "tool", "priority": 70 },
      { "block_id": "ev_order", "bucket": "evidence", "priority": 60 }
    ]
  },
  "manifests": {
    "policy_manifest": [
      { "bundle_id": "POLICY_RETURNS_V4", "rule_ids": ["R_REFUND_REQUIRES_IDV", "R_HIGH_VALUE_REQUIRES_APPROVAL"] }
    ],
    "tool_manifest": [
      { "adapter_id": "adp_orders", "capabilities": ["lookup"] },
      { "adapter_id": "adp_payments", "capabilities": ["issue_refund"] }
    ],
    "evidence_manifest": [
      { "evidence_ref": "kg:order:ord_881#snapshot_2026_05_09" }
    ]
  },
  "runtime_controls": {
    "must_refuse": ["refund_without_identity"],
    "must_escalate": ["fraud_signal_high"],
    "approval_gates_active": ["GATE_FINANCE_APPROVAL"],
    "redaction_rules_active": ["pan", "credit_card"]
  },
  "budget_report": {
    "tokens_allocated": { "business": 1200, "policy": 1800, "tool": 1200, "evidence": 3500, "memory": 1300, "session": 3000 },
    "tokens_used_at_compile": 620,
    "bucket_truncations": {}
  }
}

Minimum bar:

| Check | Failure if missing |
| --- | --- |
| policy_manifest | You cannot explain which policy shaped the run. |
| tool_manifest | The planner may hallucinate or overexpose tools. |
| evidence_manifest | The final decision cannot cite material facts. |
| runtime_controls | Guardrails live only in prompt text. |
| budget_report | Context truncation becomes invisible. |
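
The budget_report is what makes truncation visible. A minimal sketch of how a compiler might fill buckets by priority and report what did not fit is shown below. The per-block `tokens` counts are hypothetical (chosen so the kept blocks total the 620 tokens in the example), and dropping whole blocks rather than trimming them is an assumption of this sketch.

```python
def pack_blocks(blocks: list, bucket_tokens: dict) -> dict:
    """Admit blocks in priority order until each bucket's token budget is spent."""
    used, kept, truncations = {}, [], {}
    for block in sorted(blocks, key=lambda b: b["priority"], reverse=True):
        bucket = block["bucket"]
        spent = used.get(bucket, 0)
        if spent + block["tokens"] <= bucket_tokens.get(bucket, 0):
            used[bucket] = spent + block["tokens"]
            kept.append(block["block_id"])
        else:
            # Record the drop so it appears in the budget report.
            truncations.setdefault(bucket, []).append(block["block_id"])
    return {
        "kept": kept,
        "tokens_used_at_compile": sum(used.values()),
        "bucket_truncations": truncations,
    }


bucket_tokens = {"business": 1200, "policy": 1800, "tool": 1200, "evidence": 3500}
blocks = [
    {"block_id": "biz_summary", "bucket": "business", "priority": 90, "tokens": 120},
    {"block_id": "pol_0", "bucket": "policy", "priority": 80, "tokens": 200},
    {"block_id": "tool_0", "bucket": "tool", "priority": 70, "tokens": 150},
    {"block_id": "ev_order", "bucket": "evidence", "priority": 60, "tokens": 150},
    # Hypothetical oversized block that does not fit the evidence bucket.
    {"block_id": "ev_history", "bucket": "evidence", "priority": 40, "tokens": 3400},
]
report = pack_blocks(blocks, bucket_tokens)
print(report)
```

Here `ev_history` is dropped and listed under `bucket_truncations`, so an operator can see that the model never received it.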

Read next: Cognitive Core, Agentic Context Engineering.

Step 5: Bind the decision before the model acts

The Decision plane does not let the model “just answer.” It binds the run to a typed decision.

{
  "decision_key": "support.refund.execute",
  "version": "1.0.0",
  "required_evidence": ["identity_verified", "order_lookup", "policy_eval"],
  "allowed_outcomes": ["approved", "denied", "escalated"],
  "approval_mode": "destructive",
  "decision_right": "execute"
}

The Planner can propose this plan:

{
  "plan_id": "plan_refund_001",
  "intent": "support.refund",
  "steps": [
    { "id": "s1", "tool": "adp_orders.lookup", "params": { "order_id": "ord_881" }, "approval_mode": "read_only" },
    { "id": "s2", "tool": "policy.eval", "depends_on": ["s1"], "approval_mode": "read_only" },
    {
      "id": "s3",
      "tool": "adp_payments.issue_refund",
      "depends_on": ["s2"],
      "approval_mode": "destructive",
      "requires": ["GATE_FINANCE_APPROVAL"]
    }
  ],
  "decision_checkpoints": [
    { "decision_id": "support.refund.execute", "after_step": "s2" }
  ]
}

The Critic must verify the plan before execution:

| Critic check | Required answer |
| --- | --- |
| Is every tool in the tool_manifest? | yes |
| Does every tool argument satisfy schema and constraints? | yes |
| Does the destructive step carry the right approval gate? | yes |
| Does the decision spec have required evidence the plan can satisfy? | yes |
| Does the plan fit run_budget? | yes |
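
Several of these checks are mechanical and can run before any model output is trusted. The sketch below covers three of them: tool membership, approval gates on destructive steps, and the tool-call budget. It assumes `policy.eval` is a built-in runtime step, since it appears in the plan but not in the tool_manifest; `critic_findings` is a hypothetical name.

```python
# Assumption: policy.eval is provided by the runtime, not by an adapter.
BUILT_IN_STEPS = {"policy.eval"}


def critic_findings(plan: dict, tool_manifest: list, run_budget: dict) -> list:
    """Return reasons to reject the plan; an empty list means it may execute."""
    allowed = {
        f"{entry['adapter_id']}.{cap}"
        for entry in tool_manifest
        for cap in entry["capabilities"]
    } | BUILT_IN_STEPS
    findings, seen = [], set()
    for step in plan["steps"]:
        if step["tool"] not in allowed:
            findings.append(f"{step['id']}: {step['tool']} is not in the tool_manifest")
        if step.get("approval_mode") == "destructive" and not step.get("requires"):
            findings.append(f"{step['id']}: destructive step has no approval gate")
        for dep in step.get("depends_on", []):
            if dep not in seen:
                findings.append(f"{step['id']}: depends on unknown or later step {dep}")
        seen.add(step["id"])
    if len(plan["steps"]) > run_budget["max_tool_calls"]:
        findings.append("plan exceeds max_tool_calls")
    return findings


tool_manifest = [
    {"adapter_id": "adp_orders", "capabilities": ["lookup"]},
    {"adapter_id": "adp_payments", "capabilities": ["issue_refund"]},
]
plan = {
    "steps": [
        {"id": "s1", "tool": "adp_orders.lookup", "approval_mode": "read_only"},
        {"id": "s2", "tool": "policy.eval", "depends_on": ["s1"], "approval_mode": "read_only"},
        {"id": "s3", "tool": "adp_payments.issue_refund", "depends_on": ["s2"],
         "approval_mode": "destructive", "requires": ["GATE_FINANCE_APPROVAL"]},
    ]
}
print(critic_findings(plan, tool_manifest, {"max_tool_calls": 8}))
```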

Read next: Orchestration, Decision Catalog.

Step 6: Route effects through ToolEnvelope

Every external effect goes through the Action plane. A risky action is proposed, approved, and then executed. It is never a direct model-to-API call.

{
  "tool_call_id": "tc_refund_001",
  "request_id": "req_refund_001",
  "session_id": "sess_support_42",
  "tenant_id": "tenant_acme_prod",
  "decision_id": "support.refund.execute",
  "tool": {
    "tool_id": "adp_payments.issue_refund",
    "protocol": "openapi",
    "version": "2026-05-09"
  },
  "args": {
    "order_id": "ord_881",
    "amount_inr": 4200,
    "currency": "INR",
    "idempotency_key": "ik_refund_ord_881_4200"
  },
  "constraints": {
    "policy_bundle_ids": ["POLICY_RETURNS_V4"],
    "arg_schema_ref": "schema://adp_payments.issue_refund.v1",
    "approval_mode_required": "destructive",
    "requires_approval_gate": "GATE_FINANCE_APPROVAL"
  },
  "auth": {
    "delegated_user_token_ref": "tok_user_7a",
    "agent_token_ref": "tok_agent_92"
  },
  "trace": {
    "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
    "parent_span_id": "00f067aa0ba902b7",
    "span_id": "b9c7c989f97918e1"
  }
}

Minimum bar:

| Boundary | Why it matters |
| --- | --- |
| Approval mode | The tool call risk is explicit before execution. |
| Approval gate | A human or policy authority can approve the frozen evidence snapshot. |
| Idempotency key | Retries cannot duplicate side effects. |
| Delegated and agent auth | The call explains both the human authority and agent identity. |
| Trace context | Tool behavior is connected to the original run. |
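
A gateway can enforce these boundaries by comparing the envelope with the matching permission from the Context Pack. The sketch below applies the `arg_constraints` and approval gate from the `p_issue_refund` permission; `gateway_rejections` and the `approved_gates` set are hypothetical, and real schema validation would go further than min, max, and required.

```python
def gateway_rejections(envelope: dict, permission: dict, approved_gates: set) -> list:
    """Return the reasons a tool call must not run; an empty list means it may."""
    reasons = []
    if not permission.get("allow", False):
        reasons.append("capability is not permitted")
    args = envelope.get("args", {})
    for name, rule in permission.get("arg_constraints", {}).items():
        value = args.get(name)
        if value is None:
            if rule.get("required"):
                reasons.append(f"{name} is required")
            continue
        if "min" in rule and value < rule["min"]:
            reasons.append(f"{name} is below {rule['min']}")
        if "max" in rule and value > rule["max"]:
            reasons.append(f"{name} is above {rule['max']}")
    gate = permission.get("requires_approval_gate")
    if gate and gate not in approved_gates:
        reasons.append(f"{gate} has not been approved")
    return reasons


permission = {
    "permission_id": "p_issue_refund",
    "allow": True,
    "requires_approval_gate": "GATE_FINANCE_APPROVAL",
    "arg_constraints": {
        "amount_inr": {"min": 1, "max": 50000},
        "idempotency_key": {"required": True},
    },
}
envelope = {
    "tool_call_id": "tc_refund_001",
    "args": {"order_id": "ord_881", "amount_inr": 4200, "currency": "INR",
             "idempotency_key": "ik_refund_ord_881_4200"},
}
print(gateway_rejections(envelope, permission, approved_gates=set()))
print(gateway_rejections(envelope, permission, approved_gates={"GATE_FINANCE_APPROVAL"}))
```

The first call is rejected because finance has not approved yet; the second passes. The envelope itself is identical in both calls, which is why approval is tracked outside the model's output.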

Read next: Adapter Mesh, Governance.

Step 7: Emit the DecisionRecord

The MVP is not done until the run emits a durable DecisionRecord.

{
  "record_id": "dr_refund_001",
  "decision_key": "support.refund.execute",
  "decision_version": "1.0.0",
  "timestamp": "2026-05-09T09:31:42Z",
  "status": "DECIDED",
  "actor": { "type": "AGENT", "id": "agt_support" },
  "intent_ref": "support.refund",
  "subject_ids": ["customer:cus_77", "order:ord_881"],
  "inputs_refs": {
    "request": "req_refund_001",
    "session": "sess_support_42",
    "context_pack": "ctxpack.support@1.0.0",
    "snapshot": "kg:snapshot_2026_05_09"
  },
  "outputs": {
    "outcome": "approved",
    "refund_amount_inr": 4200,
    "transaction_id": "txn_q9"
  },
  "evidence_refs": [
    "kg:order:ord_881#snapshot_2026_05_09",
    "tool:adp_orders.lookup:tc_order_001",
    "policy:POLICY_RETURNS_V4#R_HIGH_VALUE_REQUIRES_APPROVAL",
    "tool:adp_payments.issue_refund:tc_refund_001"
  ],
  "policy_decisions": [
    { "policy_decision_id": "pol_refund_001", "rule_ids": ["R_REFUND_REQUIRES_IDV", "R_HIGH_VALUE_REQUIRES_APPROVAL"] }
  ],
  "approvals": [
    {
      "gate_id": "GATE_FINANCE_APPROVAL",
      "approver": "user_finance_lead_77",
      "approval_mode_effective": "destructive",
      "evidence_snapshot_hash": "sha256:b2a1...",
      "decided_at": "2026-05-09T09:31:30Z"
    }
  ],
  "controls_active": {
    "must_refuse": ["refund_without_identity"],
    "must_escalate": ["fraud_signal_high"],
    "approval_gates_active": ["GATE_FINANCE_APPROVAL"],
    "redaction_rules_active": ["pan", "credit_card"]
  },
  "confidence": 0.95,
  "budget_usage": {
    "tokens": 4720,
    "tool_calls": 3,
    "cost_usd_cents": 0.91,
    "wall_clock_ms": 1840
  },
  "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
  "replay_id": "replay_refund_001"
}

The DecisionRecord should let an operator answer:

| Operator question | Where to look |
| --- | --- |
| What did the agent decide? | outputs, status, decision_key |
| What did it rely on? | evidence_refs, inputs_refs |
| Which policy allowed it? | policy_decisions, controls_active |
| Who approved the risky action? | approvals |
| What did it cost? | budget_usage |
| How do we replay it? | trace_id, replay_id, pinned pack and snapshot refs |
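
Because the record is typed, some of these answers can be verified automatically before the record is stored. The sketch below checks a record against its DecisionSpec: the bound key and version, the outcome, the presence of evidence, and an approval for every active gate when the outcome is approved. `record_problems` is a hypothetical helper, and requiring approvals only for the `approved` outcome is an assumption.

```python
from datetime import datetime


def parse_ts(value: str) -> datetime:
    """Parse the UTC timestamps used in the record, such as 2026-05-09T09:31:42Z."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def record_problems(record: dict, spec: dict) -> list:
    problems = []
    bound = (record["decision_key"], record["decision_version"])
    if bound != (spec["decision_key"], spec["version"]):
        problems.append("record is not bound to this DecisionSpec version")
    outcome = record["outputs"]["outcome"]
    if outcome not in spec["allowed_outcomes"]:
        problems.append(f"outcome {outcome!r} is not allowed")
    if not record.get("evidence_refs"):
        problems.append("no evidence_refs")
    if outcome == "approved":
        approvals = {a["gate_id"]: a for a in record.get("approvals", [])}
        for gate in record["controls_active"]["approval_gates_active"]:
            approval = approvals.get(gate)
            if approval is None:
                problems.append(f"no approval recorded for {gate}")
            elif parse_ts(approval["decided_at"]) > parse_ts(record["timestamp"]):
                problems.append(f"approval for {gate} is later than the decision")
    return problems


spec = {"decision_key": "support.refund.execute", "version": "1.0.0",
        "allowed_outcomes": ["approved", "denied", "escalated"]}
record = {
    "decision_key": "support.refund.execute",
    "decision_version": "1.0.0",
    "timestamp": "2026-05-09T09:31:42Z",
    "outputs": {"outcome": "approved"},
    "evidence_refs": ["tool:adp_orders.lookup:tc_order_001"],
    "approvals": [{"gate_id": "GATE_FINANCE_APPROVAL", "decided_at": "2026-05-09T09:31:30Z"}],
    "controls_active": {"approval_gates_active": ["GATE_FINANCE_APPROVAL"]},
}
print(record_problems(record, spec))
```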

Read next: Decision Record, Decision Catalog, Evaluation and Observability.

Step 8: Create the replay packet

Replay is what turns the MVP from a demo into a harness.

{
  "replay_id": "replay_refund_001",
  "run_id": "run_refund_001",
  "request_ref": "req_refund_001",
  "context_pack_ref": "ctxpack.support@1.0.0",
  "knowledge_snapshot_ref": "kg:snapshot_2026_05_09",
  "policy_bundle_refs": ["POLICY_RETURNS_V4"],
  "tool_transcript_refs": ["tc_order_001", "tc_refund_001"],
  "decision_record_ref": "dr_refund_001",
  "evaluator_set_ref": "eval.support_refund@1.0.0",
  "expected_verdict": "accept"
}

Use replay for three things:

| Use | What changes | What must stay pinned |
| --- | --- | --- |
| Debug | Inspect the same run after an incident. | request, pack, snapshot, policy, tool transcripts |
| Regression | Test a new pack or policy version against known cases. | request and expected verdict |
| Improvement | Compare a proposed fix against the original failure. | evidence and evaluation criteria |
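
A replay is only trustworthy if the packet pins the same inputs the DecisionRecord cites. The sketch below compares the two artifacts from this quickstart. `replay_mismatches` is a hypothetical helper, and reading the transcript ID from the last segment of a `tool:` evidence ref is an assumption based on the example refs.

```python
def replay_mismatches(packet: dict, record: dict) -> list:
    """Return differences between a replay packet and the record it replays."""
    expected = {
        "replay_id": record["replay_id"],
        "decision_record_ref": record["record_id"],
        "request_ref": record["inputs_refs"]["request"],
        "context_pack_ref": record["inputs_refs"]["context_pack"],
        "knowledge_snapshot_ref": record["inputs_refs"]["snapshot"],
    }
    mismatches = [
        f"{field}: packet has {packet.get(field)!r}, record has {value!r}"
        for field, value in expected.items()
        if packet.get(field) != value
    ]
    # Assumption: tool evidence refs look like 'tool:<tool_id>:<tool_call_id>'.
    cited = {ref.rsplit(":", 1)[-1] for ref in record["evidence_refs"] if ref.startswith("tool:")}
    for transcript in packet.get("tool_transcript_refs", []):
        if transcript not in cited:
            mismatches.append(f"tool transcript {transcript} is not cited in evidence_refs")
    return mismatches


record = {
    "record_id": "dr_refund_001",
    "replay_id": "replay_refund_001",
    "inputs_refs": {"request": "req_refund_001", "context_pack": "ctxpack.support@1.0.0",
                    "snapshot": "kg:snapshot_2026_05_09"},
    "evidence_refs": ["kg:order:ord_881#snapshot_2026_05_09",
                      "tool:adp_orders.lookup:tc_order_001",
                      "tool:adp_payments.issue_refund:tc_refund_001"],
}
packet = {
    "replay_id": "replay_refund_001",
    "request_ref": "req_refund_001",
    "context_pack_ref": "ctxpack.support@1.0.0",
    "knowledge_snapshot_ref": "kg:snapshot_2026_05_09",
    "tool_transcript_refs": ["tc_order_001", "tc_refund_001"],
    "decision_record_ref": "dr_refund_001",
}
print(replay_mismatches(packet, record))
```

For a regression run the pack version is expected to differ, so a check like this applies to the Debug row, where everything stays pinned.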

Read next: Improvement Loop, Harness Engineering.

MVP acceptance checklist

Do not call the workflow a ContextOS MVP until every row is true.

| Check | Pass condition |
| --- | --- |
| Workflow scope | One named intent with clear allowed and disallowed outcomes. |
| Entity model | Core entities have stable IDs and evidence-producing lookups. |
| Context Pack | Pinned version contains intelligence refs, policy, tools, decision spec, memory rules, and eval targets. |
| RunContext | Run identity, tenant, user delegation, agent identity, safety mode, and budget are explicit. |
| Compiler output | CompiledContext includes manifests, runtime controls, and budget report. |
| Decision binding | The run binds to one DecisionSpec before risky action. |
| Tool boundary | Every external effect passes through a ToolEnvelope. |
| Approval | Risky actions carry an approval mode and approval gate. |
| Decision record | Final output is a typed DecisionRecord, not free-form text. |
| Replay | Request, pack, snapshot, policies, tool transcripts, and expected verdict are replayable. |
| Improvement | Failed or corrected runs become proposals, not silent prompt edits. |

Common wrong starts

| Wrong start | Better first move |
| --- | --- |
| “Let’s make the agent autonomous.” | Pick one workflow and one decision. |
| “Let’s write the perfect system prompt.” | Write the Context Pack and compiler output first. |
| “Let’s connect all tools.” | Connect one read tool and one governed write tool. |
| “Let’s log everything and debug later.” | Emit DecisionRecord and replay packet on the first MVP. |
| “Let’s let the model decide when to escalate.” | Put escalation in policy and Critic verdicts. |
| “Let’s improve the prompt after every miss.” | Turn misses into replayed, reviewed improvement proposals. |

To go deeper after this quickstart, read these pages in order:

  1. How It Works - one request end-to-end through the canonical contract.
  2. Foundations - the operating model behind the five planes.
  3. Context Pack - the schema you deploy.
  4. API Contracts - the runtime envelopes.
  5. Workflow Examples - more complete end-to-end examples.
  6. High-Risk Workflow - approval-heavy, irreversible, cross-tenant work.