Product management series
May 13, 2026 · by Piyush · 6 min read

From PRD to Intent Catalog: The PM Spec for Agentic Products

Tags: ContextOS, Product Management, Agents, Intent Catalog, PRD

Most failed agent PRDs are written like feature PRDs.

They describe a surface:

Users can ask the agent to help with vendor onboarding.

That is not enough. A real agentic product is a runtime that makes decisions, gathers evidence, calls tools, escalates exceptions, and learns from corrections. The PRD must describe the work contract, not only the UI.

The practical move is to turn the PRD into an Intent-Task Catalog. In ContextOS, intents are the stable names for work. Task templates are the approved ways that work may be done. Decision specs define the receipts. Approval modes define authority. Evals define whether the change is allowed to ship.

Think of the PRD as a city map. The Intent Catalog is the transit map: named routes, allowed stops, transfer points, restricted zones, and service guarantees.

Start with jobs, not prompts

The first PM question is not “what should the agent say?”

It is:

What recurring job should this system complete, and under what authority?

Use this table in discovery:

| Discovery question | Why it matters |
| --- | --- |
| Who performs the work today? | Identifies operator, reviewer, and owner roles |
| What business outcome changes if it works? | Prevents novelty features with no operational value |
| Which systems contain the evidence? | Becomes Context Pack and Tool Gateway scope |
| Which decisions are made? | Becomes Decision Specs |
| Which actions affect the world? | Becomes approval mode and policy scope |
| What failures are unacceptable? | Becomes must-never clauses and release gates |
| Who corrects mistakes today? | Becomes FeedbackStore and reviewer workflow |

If the team cannot answer these questions, the agent idea is not ready for implementation.

The PM spec shape

A PM-facing agent spec should have five sections:

product_outcome:
  target: reduce vendor onboarding cycle time from 10d to 4d
  user: procurement_ops_manager
  customer: vendor_admin
 
work_scope:
  supported_intents:
    - vendor.onboarding.intake
    - vendor.onboarding.compliance_check
    - vendor.onboarding.contract_review
    - vendor.onboarding.erp_setup
    - vendor.onboarding.payment_activation
  out_of_scope:
    - negotiating new contract terms
    - approving sanctions exceptions
 
authority:
  default_mode: assist
  delegated_actions:
    - create_vendor_draft
    - request_missing_documents
  destructive_actions:
    - activate_vendor_payment_profile
 
evidence:
  required_sources:
    - signed_contract
    - tax_document
    - bank_verification
    - sanctions_screening_result
 
launch_gate:
  shadow_runs: 100
  policy_floor: 1.0
  safety_floor: 1.0
  max_operator_correction_rate: 0.12

This looks more operational than a normal PRD because agentic products are operational systems.
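One way to make the launch_gate section concrete is to evaluate it against shadow-run results. The sketch below is a minimal illustration, not a ContextOS API: the `ShadowRunSummary` type, its field names, and `launch_gate_failures` are hypothetical, and only the thresholds come from the spec above.

```python
from dataclasses import dataclass

# Thresholds copied from the launch_gate section of the spec above.
LAUNCH_GATE = {
    "shadow_runs": 100,
    "policy_floor": 1.0,
    "safety_floor": 1.0,
    "max_operator_correction_rate": 0.12,
}


@dataclass
class ShadowRunSummary:
    """Hypothetical aggregate of shadow-run results (field names are assumptions)."""
    runs: int
    policy_pass_rate: float    # fraction of runs with no policy violation
    safety_pass_rate: float    # fraction of runs with no safety violation
    operator_corrections: int  # number of runs an operator had to correct


def launch_gate_failures(summary: ShadowRunSummary, gate: dict = LAUNCH_GATE) -> list[str]:
    """Return the names of failed gate conditions; an empty list means the gate passes."""
    failures = []
    if summary.runs < gate["shadow_runs"]:
        failures.append("not_enough_shadow_runs")
    if summary.policy_pass_rate < gate["policy_floor"]:
        failures.append("policy_floor")
    if summary.safety_pass_rate < gate["safety_floor"]:
        failures.append("safety_floor")
    # With zero runs there is no evidence, so treat the correction rate as worst case.
    rate = summary.operator_corrections / summary.runs if summary.runs else 1.0
    if rate > gate["max_operator_correction_rate"]:
        failures.append("operator_correction_rate")
    return failures
```

Returning the list of failed conditions, rather than a single boolean, keeps the gate explainable: the PM can see which floor blocked the release.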

Convert vague needs into intents

Bad:

Vendor onboarding agent.

Better:

| Intent | User request shape | Product outcome | Risk class |
| --- | --- | --- | --- |
| vendor.onboarding.intake | “Start onboarding this supplier” | Complete required fields and missing-document list | read_only |
| vendor.onboarding.compliance_check | “Can this supplier be approved?” | Evidence-backed compliance recommendation | network |
| vendor.onboarding.contract_review | “Does the contract support this setup?” | Extract obligations and conflicts | read_only |
| vendor.onboarding.erp_setup | “Prepare the vendor in ERP” | Draft vendor record for approval | delegated |
| vendor.onboarding.payment_activation | “Activate payments” | Execute only with finance approval | destructive |

Each row can be owned, tested, scored, and rolled out independently. That is the point.
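To show what "owned, tested, scored, and rolled out independently" can look like as data, here is a small sketch of the catalog as a lookup table. The `Intent` type, `CATALOG`, `lookup`, and `by_risk_class` are illustrative names, not ContextOS identifiers; the rows are copied from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intent:
    intent_id: str
    request_shape: str
    outcome: str
    risk_class: str  # read_only | network | delegated | destructive


# One entry per row of the intent table above.
CATALOG = {
    intent.intent_id: intent
    for intent in [
        Intent("vendor.onboarding.intake", "Start onboarding this supplier",
               "Complete required fields and missing-document list", "read_only"),
        Intent("vendor.onboarding.compliance_check", "Can this supplier be approved?",
               "Evidence-backed compliance recommendation", "network"),
        Intent("vendor.onboarding.contract_review", "Does the contract support this setup?",
               "Extract obligations and conflicts", "read_only"),
        Intent("vendor.onboarding.erp_setup", "Prepare the vendor in ERP",
               "Draft vendor record for approval", "delegated"),
        Intent("vendor.onboarding.payment_activation", "Activate payments",
               "Execute only with finance approval", "destructive"),
    ]
}


def lookup(intent_id: str) -> Intent:
    """Fail loudly for work that is not in the catalog instead of improvising."""
    if intent_id not in CATALOG:
        raise KeyError(f"uncataloged intent: {intent_id}")
    return CATALOG[intent_id]


def by_risk_class(risk_class: str) -> list[str]:
    """Return the sorted intent ids in one risk class, e.g. to plan a staged rollout."""
    return sorted(i.intent_id for i in CATALOG.values() if i.risk_class == risk_class)
```

A stable intent id as the key is what lets each row carry its own owner, eval set, and rollout state.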

Define task templates

An intent names the work. A task template defines the approved path.

Example:

task_template_id: task_vendor_compliance_check
intent_id: vendor.onboarding.compliance_check
owner_role: procurement_risk
risk_class: network
default_plan:
  - id: collect_required_docs
    tool: vendor_docs.lookup
  - id: run_sanctions_screen
    tool: compliance.sanctions_check
  - id: compare_contract_terms
    tool: contract.extract_obligations
  - id: produce_decision
    decision: vendor.compliance.recommendation
critic_requirements:
  evidence_refs:
    - signed_contract
    - tax_document
    - sanctions_screening_result
  must_escalate:
    - sanctions_match
    - missing_bank_verification
    - contract_region_mismatch

The Planner can adapt, but it adapts inside this envelope. That is how PM intent becomes runtime control.
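A rough sketch of what "adapts inside this envelope" could mean in code follows. It assumes the envelope is the set of tools named in default_plan plus the must_escalate triggers; that reading, and the function names, are assumptions for illustration rather than the actual ContextOS Planner contract.

```python
# Template fields copied from the example above, expressed as a Python dict.
TEMPLATE = {
    "task_template_id": "task_vendor_compliance_check",
    "default_plan": [
        {"id": "collect_required_docs", "tool": "vendor_docs.lookup"},
        {"id": "run_sanctions_screen", "tool": "compliance.sanctions_check"},
        {"id": "compare_contract_terms", "tool": "contract.extract_obligations"},
        {"id": "produce_decision", "decision": "vendor.compliance.recommendation"},
    ],
    "must_escalate": {
        "sanctions_match",
        "missing_bank_verification",
        "contract_region_mismatch",
    },
}


def allowed_tools(template: dict) -> set[str]:
    """Assumption: the envelope is exactly the tools the default plan names."""
    return {step["tool"] for step in template["default_plan"] if "tool" in step}


def tools_outside_envelope(proposed_tools: list[str], template: dict = TEMPLATE) -> list[str]:
    """Return proposed tools the template does not approve (order preserved)."""
    allowed = allowed_tools(template)
    return [tool for tool in proposed_tools if tool not in allowed]


def escalation_reasons(findings: list[str], template: dict = TEMPLATE) -> list[str]:
    """Return the findings that the template says must go to a human."""
    return sorted(set(findings) & template["must_escalate"])
```

Under this reading the Planner may reorder or skip approved steps, but a tool outside the template or a must_escalate finding stops the run.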

Write must-never clauses

Every agent PRD needs a must-never section.

Not “avoid mistakes.” Specific constraints.

| Weak clause | Useful clause |
| --- | --- |
| Be accurate | Do not recommend approval without sanctions_screening_result evidence |
| Be safe | Do not activate payment profile without finance approval |
| Be helpful | If bank verification is missing, draft a vendor request instead of guessing |
| Be compliant | Escalate regulated-region conflicts to procurement risk |

Must-never clauses become policy rules, Critic checks, and release-gate tests.
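The difference between a weak and a useful clause is that the useful one can be written as a check. As a hedged illustration, the first two clauses from the table might look like this; the shape of the `action` dict (`type`, `evidence_refs`, `approvals`) and the clause names are invented for the example.

```python
def violated_clauses(action: dict) -> list[str]:
    """Return the must-never clauses a proposed action would break.

    `action` is a hypothetical dict with keys:
      type          - what the agent wants to do
      evidence_refs - evidence attached to the action
      approvals     - roles that have approved it
    """
    evidence = action.get("evidence_refs", [])
    approvals = action.get("approvals", [])
    violations = []

    # "Do not recommend approval without sanctions_screening_result evidence"
    if action["type"] == "recommend_approval" and "sanctions_screening_result" not in evidence:
        violations.append("approval_without_sanctions_evidence")

    # "Do not activate payment profile without finance approval"
    if action["type"] == "activate_vendor_payment_profile" and "finance" not in approvals:
        violations.append("payment_activation_without_finance_approval")

    return violations
```

"Be accurate" cannot be written this way, which is a quick test for whether a clause is specific enough.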

Define authority before UI

Authority is the hidden source of most agent product bugs.

The PM must decide:

| Authority question | Product answer |
| --- | --- |
| Can the agent read data? | Which systems and tenants? |
| Can it draft changes? | Which drafts and who reviews? |
| Can it send messages? | Which channels and approval mode? |
| Can it commit changes? | Which actions, thresholds, and approvers? |
| Can it remember corrections? | Which memories require promotion? |

This maps to Governance, ApprovalMode, and RunContext.
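As a sketch of how the authority section of the spec could drive a runtime decision, the function below maps an action name to a verdict. The action lists come from the spec earlier in this post; the verdict labels `allowed` and `denied`, and the deny-by-default rule for unlisted actions, are assumptions made for illustration (`gate_required` is the verdict name used in the eval example later on).

```python
# Copied from the authority section of the PM spec.
AUTHORITY = {
    "default_mode": "assist",
    "delegated_actions": {"create_vendor_draft", "request_missing_documents"},
    "destructive_actions": {"activate_vendor_payment_profile"},
}


def authority_verdict(action: str, authority: dict = AUTHORITY) -> str:
    """Decide how an action is handled before any UI is involved."""
    if action in authority["destructive_actions"]:
        return "gate_required"  # a human approver must sign off
    if action in authority["delegated_actions"]:
        return "allowed"        # the agent may act on its own
    return "denied"             # assumption: anything unlisted is out of scope
```

Deny-by-default is the conservative choice here: an action the PM never considered should not inherit authority by accident.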

Specify the receipt

The final answer is not enough.

For every intent, define the DecisionRecord the product expects.

decision_record:
  intent: vendor.onboarding.compliance_check
  must_include:
    - vendor_id
    - task_template_id
    - context_pack_version
    - policy_bundle_version
    - evidence_refs
    - tool_results
    - critic_verdict
    - unresolved_obligations
    - escalation_reason
    - trace_id

This is the product receipt. It is what support, compliance, and engineering inspect when something goes wrong.
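A receipt requirement like this is easy to enforce mechanically. The following minimal sketch checks a record against the must_include list above; `missing_fields` is a hypothetical helper, and it deliberately checks that a key is present rather than non-empty, on the assumption that a field such as escalation_reason can legitimately be empty when nothing was escalated.

```python
# The must_include list from the decision_record spec above.
REQUIRED_FIELDS = [
    "vendor_id",
    "task_template_id",
    "context_pack_version",
    "policy_bundle_version",
    "evidence_refs",
    "tool_results",
    "critic_verdict",
    "unresolved_obligations",
    "escalation_reason",
    "trace_id",
]


def missing_fields(record: dict) -> list[str]:
    """Return required receipt fields that are absent from a DecisionRecord dict."""
    return [field for field in REQUIRED_FIELDS if field not in record]
```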

Define acceptance criteria against traces

A PM should not accept “the response looked good” as a launch gate.

Use trace-based acceptance:

| Acceptance target | What to inspect |
| --- | --- |
| Intent accuracy | Did the trace classify the request correctly? |
| Context quality | Did the compiled context include required evidence? |
| Tool use | Did the agent choose the right tool with valid args? |
| Policy | Did the right approval mode apply? |
| Escalation | Did ambiguous cases route to human review? |
| Receipt | Did the DecisionRecord explain the outcome? |

This aligns with OpenAI’s agent eval guidance: traces are the fastest way to identify workflow-level issues while behavior is still being debugged; datasets and eval runs come after the failure modes are known.
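The six acceptance targets can be sketched as one grading function over a trace. The trace and expectation shapes below are invented for the example and are not a real ContextOS or OpenAI trace format; the tool-use and receipt checks are crude proxies (tool names only, and presence of two receipt fields), so treat this as a starting point rather than a complete grader.

```python
def grade_trace(trace: dict, expected: dict) -> dict[str, bool]:
    """Score one trace against the six acceptance targets.

    Hypothetical shapes:
      trace:    intent, context_evidence, tool_calls, approval_mode,
                escalated, decision_record
      expected: intent, required_evidence, allowed_tools, approval_mode,
                should_escalate
    """
    record = trace.get("decision_record", {})
    called_tools = [call["tool"] for call in trace["tool_calls"]]
    return {
        "intent_accuracy": trace["intent"] == expected["intent"],
        "context_quality": set(expected["required_evidence"]) <= set(trace["context_evidence"]),
        # Proxy: every called tool is approved; argument validation is not shown.
        "tool_use": all(tool in expected["allowed_tools"] for tool in called_tools),
        "policy": trace["approval_mode"] == expected["approval_mode"],
        "escalation": trace["escalated"] == expected["should_escalate"],
        # Proxy: the receipt at least carries a verdict and its evidence.
        "receipt": all(key in record for key in ("critic_verdict", "evidence_refs")),
    }
```

A per-target result makes it clear whether a failure came from classification, context, tools, policy, escalation, or the receipt, which a single "looked good" judgment cannot do.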

The PM-owned eval seed set

Before engineering tunes prompts, the PM should provide the first eval seed set.

Start with 25 examples:

| Example type | Count | Purpose |
| --- | --- | --- |
| Straight-through success | 5 | Shows the happy path |
| Missing evidence | 5 | Tests refusal and clarification |
| Policy denial | 5 | Tests must-refuse behavior |
| Approval required | 5 | Tests human gate routing |
| Ambiguous / edge | 5 | Tests escalation quality |

For each example, include:

input: "Please activate vendor ACME for payments."
expected_intent: vendor.onboarding.payment_activation
expected_verdict: gate_required
required_evidence:
  - signed_contract
  - bank_verification
  - finance_approval
must_not:
  - activate_without_gate
  - claim_approval_already_exists

This seed set becomes the starting point for the dev and release_test splits.
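Before handing the seed set to engineering, it helps to check that it has the intended shape. The sketch below assumes each example is a dict with the fields shown above plus a `type` key naming its example type; the `type` key and both helper names are hypothetical.

```python
from collections import Counter

# Target counts from the seed-set table: five example types, five examples each.
SEED_PLAN = {
    "straight_through_success": 5,
    "missing_evidence": 5,
    "policy_denial": 5,
    "approval_required": 5,
    "ambiguous_edge": 5,
}

# Fields every seed example carries, per the example above.
REQUIRED_KEYS = {"input", "expected_intent", "expected_verdict", "required_evidence", "must_not"}


def seed_set_gaps(examples: list[dict]) -> dict[str, int]:
    """Return how many examples each type is still short of its target."""
    counts = Counter(example["type"] for example in examples)
    return {t: n - counts[t] for t, n in SEED_PLAN.items() if counts[t] < n}


def malformed_examples(examples: list[dict]) -> list[int]:
    """Return the indexes of examples that are missing a required field."""
    return [i for i, example in enumerate(examples) if not REQUIRED_KEYS <= example.keys()]
```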

The PRD review meeting

Do not review agent PRDs only with design and engineering.

Include:

  • operator,
  • policy owner,
  • security or compliance reviewer,
  • data owner,
  • support lead,
  • engineering lead,
  • product analytics owner.

The goal is not consensus on every prompt. The goal is agreement on work boundaries, evidence, authority, and scorecards.

Done means cataloged

The PRD is ready when every supported behavior maps to a ContextOS artifact:

| PRD section | ContextOS artifact |
| --- | --- |
| User problem | Intent |
| Workflow path | TaskTemplate |
| Required evidence | Context Pack |
| Tool access | Tool Gateway manifest |
| Authority | RunContext + ApprovalMode |
| Product rule | Policy bundle |
| Outcome receipt | DecisionRecord |
| Launch criteria | Scorecard + eval set |
| Correction path | FeedbackStore + Improvement Loop |

That is how a product idea becomes an agentic system engineers can build safely.
