Agents
17 essays tagged with Agents.
AI Does Not Launch Once: Feedback Loops After Go-Live
A plain-English guide to operating agents after launch: corrections, recurring failures, proposal queues, rollout, rollback, and review.
How to Judge AI Work: Scorecards, Not Vibes
A practical guide for business teams evaluating AI agents with scorecards, examples, traces, human corrections, and launch gates instead of demos and vibes.
Trusting AI at Work: Approvals, Boundaries, and Receipts
A plain-English guide to agent trust: what AI can read, draft, send, change, approve, and how receipts make decisions accountable.
Before Your Team Asks for an AI Agent, Map the Real Work
A practical guide for business teams mapping real work before building agents: actors, evidence, tools, decisions, risks, exceptions, and feedback loops.
AI Agents for Business Leaders: Build the Airport, Not Just the Plane
A practical executive playbook for agentic AI: define the work, evidence, authority, scorecards, approvals, security, observability, and improvement loop.
Operating Agent Products: Feedback, Rollout, and the Improvement Loop
A PM operating model for shipped agents: trace review, corrections, proposal queues, scorecards, rollout, and rollback.
Trust Is a Product Surface: Approval Modes and Human Control for Agentic Products
How PMs should design trust for real agentic products: approval modes, human roles, evidence snapshots, DecisionRecords, policy gates, and graceful failure.
Scorecards Before Screens: Evals and Launch Gates for PMs Building Agents
A PM guide to defining agent quality with datasets, trace reviews, scorecards, release gates, and business metrics before building the agent UI.
The Control Tower Pattern: How PMs Should Design Multi-Agent Products
A PM guide to splitting multi-agent systems into specialist lanes while keeping orchestration governed and inspectable.
From PRD to Intent Catalog: The PM Spec for Agentic Products
How PMs turn vague agent ideas into intent catalogs, task templates, authority models, DecisionRecords, and launch criteria.
Product Managers: How to Think About and Build Complex Agentic Systems
A practical PM guide to building agentic systems with workflow maps, intents, context packs, tools, records, evals, and rollout gates.
How to Develop an Agent with an Agent Harness, End to End
An end-to-end field guide for building agents as measurable harnesses: context, planning, tools, records, evals, rollout, and learning.
Dataset-First Agent Engineering: The Golden Sets Behind Reliable Agents
A practical guide to golden sets, task distributions, corrected runs, held-out releases, and production slices for agent engineering.
How Great AI Engineers Build Agents: Datasets, Scores, and Harnesses That Improve
Why strong AI engineers build datasets, scorecards, traces, and improvement loops instead of treating agents as prompts plus tools.
Harness Candidates Are Model Checkpoints: How to Improve Agents Without Silent Mutation
How to treat every prompt, retrieval, tool, policy, and evaluator change as a scored, reviewed, reversible harness candidate.
Scorecards Over Vibes: The Five Metrics That Keep Agents Honest
A scorecard built on five metrics: policy, utility, latency, safety, and economics.
Trace Review Is the Agent Debugger: Grade the Path, Not Just the Answer
How trace review inspects context, plans, tools, guardrails, critic verdicts, and corrections to evaluate how an agent reached its answer.