Start here

Read ContextOS by job to be done

ContextOS is easiest to learn through the problem you are trying to solve: governing agents, building the runtime, shaping the product, or proving trust.

If you read only one path

Start here for the shortest route from business problem to governed production run.

May 13, 2026·20 min read

AI Agents for Business Leaders: Build the Airport, Not Just the Plane

A practical executive playbook for agentic AI: define the work, evidence, authority, scorecards, approvals, security, observability, and improvement loop.

May 9, 2026·18 min read

Agentic AI Systems Before and After ContextOS

A table-first guide to why agentic systems need bounded context, governed tools, typed decisions, replay, evaluation, and controlled improvement.

May 9, 2026·21 min read

The Agent Harness Audit: A Production Readiness Checklist for Governed AI Agents

A production readiness audit for agent harnesses: forty-four runtime controls grouped into eight evidence-backed outcomes.

May 8, 2026·5 min read

End-to-End Refund: How 12 Primitives Compose in One Production Run

A single refund run traced through 12 ContextOS primitives, from invokeAgent envelope to byte-equal replay.

Most useful first shares

July 12, 2026·9 min read

Threat-Model an AI Agent: Sources, Sinks, Authority, and Blast Radius

A practical AI agent threat-modeling method that maps untrusted sources to dangerous sinks, then constrains identity, authority, data, and blast radius at deterministic runtime boundaries.

May 19, 2026·33 min read

Agent Harness: An Architectural Framework for Production AI Agents

A whitepaper on typed contracts, policy gates, traces, verification loops, and release control for production AI agents.

May 17, 2026·13 min read

Agent Identity Is the New Trust Boundary

A practical model for separating agent identity, workload proof, user delegation, scoped authority, and audit across MCP and A2A.

May 9, 2026·21 min read

The Agent Harness Audit: A Production Readiness Checklist for Governed AI Agents

A production readiness audit for agent harnesses: forty-four runtime controls grouped into eight evidence-backed outcomes.

For engineering leaders

Architecture, harness quality, runtime boundaries, and the build path from prototype to governed agent.

May 16, 2026·28 min read

ContextOS: A Research-Grounded Architecture for Governed Agent Runtimes

A research-grounded framing of ContextOS as a governed runtime for context, tools, memory, security, evaluation, replay, and optimization.

May 12, 2026·13 min read

How Great AI Engineers Build Agents: Datasets, Scores, and Harnesses That Improve

Why strong AI engineers build datasets, scorecards, traces, and improvement loops instead of treating agents as prompts plus tools.

May 5, 2026·5 min read

Build the Tool Gateway: The Boundary That Actually Stops a Bad Action

A build-along for the Tool Gateway: adapter manifests, typed envelopes, resolver checks, dispatch, and destructive-action boundaries.

May 9, 2026·6 min read

Replay Harness in Code: Reproducing a DecisionRecord Byte-for-Byte

A TypeScript build-along for replay: input loading, hash-chain verification, canonical loop replay, and DecisionRecord diffing.

For product managers

How to turn agent ideas into intents, launch gates, scorecards, trust surfaces, and operating loops.

May 13, 2026·18 min read

Product Managers: How to Think About and Build Complex Agentic Systems

A practical PM guide to building agentic systems with workflow maps, intents, context packs, tools, records, evals, and rollout gates.

May 13, 2026·6 min read

From PRD to Intent Catalog: The PM Spec for Agentic Products

How PMs turn vague agent ideas into intent catalogs, task templates, authority models, DecisionRecords, and launch criteria.

May 13, 2026·6 min read

Scorecards Before Screens: Evals and Launch Gates for PMs Building Agents

A PM guide to defining agent quality with datasets, trace reviews, scorecards, release gates, and business metrics before building the agent UI.

May 13, 2026·6 min read

Operating Agent Products: Feedback, Rollout, and the Improvement Loop

A PM operating model for shipped agents: trace review, corrections, proposal queues, scorecards, rollout, and rollback.

For executives and operators

Plain-English mental models for authority, scorecards, approvals, and feedback after launch.

May 13, 2026·20 min read

AI Agents for Business Leaders: Build the Airport, Not Just the Plane

A practical executive playbook for agentic AI: define the work, evidence, authority, scorecards, approvals, security, observability, and improvement loop.

May 23, 2026·12 min read

The Autonomy Budget: How Enterprises Should Decide What AI Agents Are Allowed to Do

A practical governance model for granting AI agents bounded authority based on risk, evidence, policy confidence, evals, and approval.

May 13, 2026·4 min read

Trusting AI at Work: Approvals, Boundaries, and Receipts

A plain-English guide to agent trust: what AI can read, draft, send, change, approve, and how receipts make decisions accountable.

May 13, 2026·8 min read

AI Does Not Launch Once: Feedback Loops After Go-Live

A plain-English guide to operating agents after launch: corrections, recurring failures, proposal queues, rollout, rollback, and review.

For security and compliance

Identity, prompt-injection boundaries, approval tiers, replay, and audit controls for high-authority agents.

July 12, 2026·9 min read

Threat-Model an AI Agent: Sources, Sinks, Authority, and Blast Radius

A practical AI agent threat-modeling method that maps untrusted sources to dangerous sinks, then constrains identity, authority, data, and blast radius at deterministic runtime boundaries.

May 17, 2026·13 min read

Agent Identity Is the New Trust Boundary

A practical model for separating agent identity, workload proof, user delegation, scoped authority, and audit across MCP and A2A.

July 12, 2026·8 min read

Secure the MCP and Tool Supply Chain: Trust Must Be Continuous

How to secure MCP servers, remote tools, connectors, skills, and their outputs with admission controls, audience-bound authorization, least privilege, runtime containment, and revocation.

July 12, 2026·9 min read

Red-Team Agent Hijacking: Build a Security Eval Gate for Repeat Attacks

A practical agent-hijacking evaluation harness: scenario design, adaptive and repeated attempts, path-aware metrics, deterministic release gates, and production replay.
