MCP is a useful protocol. It gives agent hosts a standard way to discover tools, resources, and prompts, and it standardizes transports, authorization, elicitation, sampling, and long-running tasks.
That does not make an MCP server production-safe by default.
In ContextOS, MCP sits behind the Adapter Mesh and the Tool Gateway. The model never calls an MCP tool directly. The model proposes a step; the Gateway resolves the capability from the compiled tool manifest, validates arguments, checks policy, applies approval-mode routing, brokers credentials, executes the adapter, and records the result.
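The Gateway sequence described above can be sketched in a few lines. This is a minimal illustration, not the ContextOS implementation: the `Capability` and `Gateway` classes, the `required_args` stand-in for schema validation, and the `approved` flag are all hypothetical names, and the policy check and credential brokering steps are omitted for brevity.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Capability:
    capability_id: str
    approval_mode: str                  # "read_only" or "destructive", as in the manifest
    required_args: tuple[str, ...]      # hypothetical stand-in for JSON Schema validation
    adapter: Callable[[dict[str, Any]], dict[str, Any]]


@dataclass
class Gateway:
    manifest: dict[str, Capability]
    audit_log: list[dict[str, Any]] = field(default_factory=list)

    def execute(self, capability_id: str, args: dict[str, Any],
                approved: bool = False) -> dict[str, Any]:
        # 1. Resolve the capability from the compiled manifest, never from model text.
        cap = self.manifest.get(capability_id)
        if cap is None:
            return self._record(capability_id, "rejected", "unknown capability")
        # 2. Validate arguments before anything touches the adapter.
        missing = [name for name in cap.required_args if name not in args]
        if missing:
            return self._record(capability_id, "rejected", f"missing args: {missing}")
        # 3. Approval-mode routing: anything beyond read_only waits for approval.
        if cap.approval_mode != "read_only" and not approved:
            return self._record(capability_id, "pending_approval", "approval required")
        # 4. Execute the adapter and record the result.
        return self._record(capability_id, "executed", cap.adapter(args))

    def _record(self, capability_id: str, status: str, detail: Any) -> dict[str, Any]:
        entry = {"capability_id": capability_id, "status": status, "detail": detail}
        self.audit_log.append(entry)
        return entry
```

Every path, including rejections, ends in `_record`, so the audit log reflects what the model proposed as well as what actually ran.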
The production rule
A production adapter must answer five questions without reading a prompt:
| Question | Required proof |
|---|---|
| What can this adapter do? | capability manifest with tools, resources, prompts, schemas, protocol version, transport, and owner. |
| How risky is each operation? | capability class and declared maximum approval_mode. |
| Who can invoke it? | auth contract, scopes, tenant rules, RunContext claims, and policy bindings. |
| What evidence does it return? | structured output, evidence refs, audit metadata, retry and error semantics. |
| Can it be replayed or audited later? | toolCall, toolResult, W3C trace context, idempotency, DecisionRecord linkage, version pins. |
If the answer is “the model knows” or “the prompt says,” the adapter is not ready.
MCP object types have different trust semantics
MCP exposes several surfaces. ContextOS should not treat them all as tools.
| MCP surface | Use it for | ContextOS treatment |
|---|---|---|
| Tool | model-invocable operation | map to a governed capability with schema, approval mode, and policy. |
| Resource | readable artifact or template | treat as evidence or context with source, classification, and access control. |
| Prompt | repeatable prompt artifact | version and evaluate; never treat as authority. |
| Root | filesystem boundary | treat as the maximum path boundary, not as a broad permission grant. |
| Sampling | server-requested model call | route through AI Gateway budget and policy controls. |
| Elicitation | user input or auth handoff | consent-bound, rate-limited, recorded in lineage. |
| Task | durable request state | map to resumable tool execution and deferred DecisionRecord states. |
The official MCP task specification is useful because it gives long-running work a state model: working, input_required, completed, failed, and cancelled. ContextOS should still bind those tasks to authorization context, trace ids, timeouts, approval gates, and DecisionRecords.
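One way to keep the task state model explicit is to parse the status into an enum and map it to a DecisionRecord state in a single place. The sketch below uses the five statuses named above; the DecisionRecord state names on the right-hand side (`deferred`, `awaiting_input`, `finalized`, and so on) are hypothetical, chosen only for illustration.

```python
from enum import Enum


class TaskStatus(str, Enum):
    WORKING = "working"
    INPUT_REQUIRED = "input_required"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELLED = "cancelled"


# Statuses after which the task should not change again.
TERMINAL = {TaskStatus.COMPLETED, TaskStatus.FAILED, TaskStatus.CANCELLED}

# Hypothetical DecisionRecord states; the real names are not given in this article.
DECISION_STATE = {
    TaskStatus.WORKING: "deferred",
    TaskStatus.INPUT_REQUIRED: "awaiting_input",
    TaskStatus.COMPLETED: "finalized",
    TaskStatus.FAILED: "failed",
    TaskStatus.CANCELLED: "cancelled",
}


def decision_state(raw_status: str) -> str:
    """Map a task status string to a DecisionRecord state.

    Unknown statuses raise ValueError instead of being guessed at.
    """
    return DECISION_STATE[TaskStatus(raw_status)]
```

Failing closed on an unknown status keeps a server from introducing a state the Gateway has no policy for.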
The adapter manifest
MCP discovery tells you what the server says it can do. The ContextOS manifest tells production what is allowed to depend on it.
{
"adapter_id": "adp_payments_mcp",
"protocol": "mcp",
"protocol_version": "2025-11-25",
"transport": {
"kind": "streamable_http",
"endpoint_ref": "https://payments.example.com/mcp"
},
"auth": {
"type": "oauth2",
"initial_scopes": ["payments.read"],
"step_up_scopes": {
"payments.issue_refund": ["payments.refund.write"]
},
"allow_token_passthrough": false
},
"capabilities": [
{
"capability_id": "payments.lookup_transaction",
"mcp_tool_name": "payments_lookup_transaction",
"capability_class": "observe",
"approval_mode": "read_only",
"decision_bindings": ["support.refund.eligibility"],
"output_schema_ref": "schema://payments.lookup_transaction.output.v1",
"evidence_keys": ["transaction_id", "status", "settlement_ref"]
},
{
"capability_id": "payments.issue_refund",
"mcp_tool_name": "payments_issue_refund",
"capability_class": "act",
"approval_mode": "destructive",
"requires_approval_gate": "GATE_FINANCE_APPROVAL",
"idempotency": {
"required": true,
"dedup_window_seconds": 86400
},
"evidence_keys": ["refund_id", "transaction_id", "reversal_token"]
}
]
}
The manifest is deliberately boring. Boring is good. It lets a reviewer diff risk.
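Diffing risk between two manifest versions could look roughly like this. The field names come from the manifest above; the `risk_view` and `diff_risk` helpers are hypothetical, and a real review tool would likely compare more fields (scopes, schema refs, idempotency settings).

```python
def risk_view(manifest: dict) -> dict:
    """Reduce a manifest to the fields that determine how risky each capability is."""
    return {
        cap["capability_id"]: (
            cap.get("capability_class"),
            cap.get("approval_mode"),
            cap.get("requires_approval_gate"),
        )
        for cap in manifest.get("capabilities", [])
    }


def diff_risk(old: dict, new: dict) -> dict:
    """Report capabilities that were added, removed, or changed in risk posture."""
    before, after = risk_view(old), risk_view(new)
    return {
        "added": sorted(after.keys() - before.keys()),
        "removed": sorted(before.keys() - after.keys()),
        "changed": sorted(
            cid for cid in before.keys() & after.keys() if before[cid] != after[cid]
        ),
    }
```

A reviewer then only has to read the `added` and `changed` lists to see whether a manifest update widened what the adapter can do.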
Narrow tools beat broad tools
Bad MCP tool:
{
"name": "payments_request",
"description": "Call the payments API",
"inputSchema": {
"type": "object",
"properties": {
"method": { "type": "string" },
"path": { "type": "string" },
"payload": { "type": "object" }
}
}
}
Good MCP tools:
[
{
"name": "payments_lookup_transaction",
"description": "Read a settled transaction by id.",
"approval_mode": "read_only"
},
{
"name": "payments_issue_refund",
"description": "Issue a refund against an existing settled transaction.",
"approval_mode": "destructive"
}
]
The second design lets ContextOS expose lookup without exposing refund. It lets the Critic require evidence before refund. It lets the Tool Gateway require idempotency only where a write can happen. It lets policy bind payments.issue_refund to a finance approval gate.
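Two of those benefits are easy to illustrate: filtering which tools are exposed by approval mode, and deduplicating writes inside the manifest's dedup window. The sketch below is an assumption-laden illustration; `exposed_tools`, `IdempotentExecutor`, and the in-memory cache are hypothetical, and a production Gateway would need durable storage for idempotency keys.

```python
import time
from typing import Any, Callable


def exposed_tools(manifest: dict, allowed_modes: set) -> list:
    """List only the MCP tools whose approval mode is allowed in this context."""
    return [
        cap["mcp_tool_name"]
        for cap in manifest["capabilities"]
        if cap["approval_mode"] in allowed_modes
    ]


class IdempotentExecutor:
    """Return the stored result when the same key is replayed inside the window."""

    def __init__(self, window_seconds: float, clock: Callable[[], float] = time.time):
        self._window = window_seconds
        self._clock = clock
        self._seen: dict = {}  # idempotency key -> (timestamp, result)

    def run(self, idempotency_key: str, operation: Callable[[], Any]) -> Any:
        now = self._clock()
        hit = self._seen.get(idempotency_key)
        if hit is not None and now - hit[0] < self._window:
            return hit[1]  # replay: do not execute the write a second time
        result = operation()
        self._seen[idempotency_key] = (now, result)
        return result
```

With a broad `payments_request` tool neither function has anything to work with: there is one tool to expose or hide, and no way to tell which calls are writes.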
Task handling for long-running tools
Long-running work is where weak adapters usually leak state. A production MCP task should produce a ContextOS state transition.
If task state is accessible by guessing a task id, the adapter is unsafe. If task state can outlive the run without retention policy, the adapter is unauditable. If task cancellation does not preserve audit metadata, incident response will have a blind spot.
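The three failure modes above suggest what a task store has to enforce: unguessable ids bound to the caller, a retention limit, and cancellation that appends to the audit trail instead of deleting the record. A minimal in-memory sketch follows; `TaskStore`, `TaskRecord`, and the tenant/subject pair used as the auth context are hypothetical names, not part of MCP or ContextOS.

```python
import secrets
import time
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class TaskRecord:
    task_id: str
    tenant_id: str
    subject: str
    trace_id: str
    expires_at: float
    status: str = "working"
    audit: list = field(default_factory=list)


class TaskStore:
    def __init__(self, ttl_seconds: float = 3600.0, clock: Callable[[], float] = time.time):
        self._tasks: dict = {}
        self._ttl = ttl_seconds
        self._clock = clock

    def create(self, tenant_id: str, subject: str, trace_id: str) -> TaskRecord:
        task_id = secrets.token_urlsafe(32)  # random id, not a guessable counter
        rec = TaskRecord(task_id, tenant_id, subject, trace_id, self._clock() + self._ttl)
        rec.audit.append(("created", subject))
        self._tasks[task_id] = rec
        return rec

    def get(self, task_id: str, tenant_id: str, subject: str) -> TaskRecord:
        rec = self._tasks.get(task_id)
        # Same error for "missing" and "belongs to someone else": no id probing.
        if rec is None or (rec.tenant_id, rec.subject) != (tenant_id, subject):
            raise PermissionError("task not found for this caller")
        if self._clock() >= rec.expires_at:
            raise LookupError("task expired under retention policy")
        return rec

    def cancel(self, task_id: str, tenant_id: str, subject: str) -> TaskRecord:
        rec = self.get(task_id, tenant_id, subject)
        rec.status = "cancelled"
        rec.audit.append(("cancelled", subject))  # keep the trail, do not delete
        return rec
```

Knowing a task id is not enough here; the lookup succeeds only when the caller's auth context matches the one that created the task.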
Threats the manifest should make visible
OWASP’s MCP Top 10 project frames MCP-specific risks such as model misbinding, scope creep, context spoofing, prompt-state manipulation, insecure memory references, and contextual prompt injection. ContextOS maps those risks into manifest checks:
| Risk | Manifest control |
|---|---|
| Scope creep | per-capability approval mode and scoped credentials. |
| Tool confusion | stable capability_id, owner, version, and schema refs. |
| Prompt injection through tool descriptions | treat descriptions as untrusted UI text; authority lives in policy. |
| Insecure memory references | resources carry classification, source id, and access rules. |
| Hidden model calls | sampling routes through AI Gateway policy and budget. |
| Long-running task leakage | task ids bound to auth context, TTL, trace, and audit. |
The point is not to avoid MCP. The point is to put MCP behind the same production boundary as every other adapter.
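As one possible way to back the "tool confusion" and "prompt injection through tool descriptions" rows with a check, the Gateway could pin a fingerprint of each reviewed tool definition and flag anything discovery returns that does not match. This is an assumption about how such a control might be built, not something the manifest format above defines; `tool_fingerprint` and `drifted_tools` are hypothetical helpers.

```python
import hashlib
import json


def tool_fingerprint(tool: dict) -> str:
    """Hash the parts of a discovered tool definition that a reviewer signed off on."""
    canonical = json.dumps(
        {
            "name": tool.get("name"),
            "description": tool.get("description"),
            "inputSchema": tool.get("inputSchema"),
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def drifted_tools(pinned: dict, discovered: list) -> list:
    """Names of discovered tools that are unpinned or differ from their pinned hash."""
    return sorted(
        tool["name"]
        for tool in discovered
        if pinned.get(tool["name"]) != tool_fingerprint(tool)
    )
```

A non-empty result does not prove an attack; it means the server changed something after review, and the change should go back through the manifest diff before the tool is exposed again.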
Adapter readiness checklist
- Each capability is named by business operation, not raw endpoint.
- Read and write operations are separate tools.
- Every input and output has JSON Schema.
- Every write-class operation requires idempotency.
- Every capability declares max approval_mode.
- Auth uses least-privilege step-up scopes; token passthrough is off by default.
- Resources carry source, MIME type, size, classification, and evidence identity.
- Sampling, elicitation, and tasks have explicit policy bindings.
- Gateway tests cover protocol, schema, auth, policy, replay, and evaluator behavior.
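Several checklist items are mechanical enough to lint in CI. The sketch below covers four of them and rests on assumptions that are not stated in this article: that `capability_class: "act"` marks a write-class operation, and that an input schema would be referenced by a field named `input_schema_ref` (only `output_schema_ref` appears in the example manifest).

```python
def lint_manifest(manifest: dict) -> list:
    """Return human-readable findings; an empty list means these checks passed."""
    findings = []
    # Token passthrough is off by default, so only an explicit true is a finding.
    if manifest.get("auth", {}).get("allow_token_passthrough", False):
        findings.append("auth: token passthrough must be off")
    for cap in manifest.get("capabilities", []):
        cid = cap.get("capability_id", "<missing capability_id>")
        if not cap.get("approval_mode"):
            findings.append(f"{cid}: missing approval_mode")
        # input_schema_ref is a hypothetical field name (assumption).
        for key in ("input_schema_ref", "output_schema_ref"):
            if not cap.get(key):
                findings.append(f"{cid}: missing {key}")
        # Assumption: capability_class "act" denotes a write-class operation.
        is_write = cap.get("capability_class") == "act"
        if is_write and not cap.get("idempotency", {}).get("required", False):
            findings.append(f"{cid}: write-class capability must require idempotency")
    return findings
```

Items such as "named by business operation" or "read and write operations are separate tools" still need a human reviewer; the lint only keeps the mechanical ones from regressing.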
Research base
- ContextOS spec: Build an Adapter, Model Context Protocol, Adapter Mesh, and Tool Manager.
- MCP official docs: Tasks, Tools, Resources, and Authorization.
- OWASP MCP Top 10 for MCP-specific security failure modes.