AI Engineering
Six essays tagged with AI Engineering.
Agent Harness: An Architectural Framework for Production AI Agents
A whitepaper on typed contracts, policy gates, traces, verification loops, and release control for production AI agents.
Dataset-First Agent Engineering: The Golden Sets Behind Reliable Agents
A practical guide to golden sets, task distributions, corrected runs, held-out releases, and production slices for agent engineering.
How Great AI Engineers Build Agents: Datasets, Scores, and Harnesses That Improve
Why strong AI engineers build datasets, scorecards, traces, and improvement loops instead of treating agents as prompts plus tools.
Harness Candidates Are Model Checkpoints: How to Improve Agents Without Silent Mutation
How to treat every prompt, retrieval, tool, policy, and evaluator change as a scored, reviewed, reversible harness candidate.
Scorecards Over Vibes: The Five Metrics That Keep Agents Honest
A scorecard of five metrics for evaluating agents: policy, utility, latency, safety, and economics.
Trace Review Is the Agent Debugger: Grade the Path, Not Just the Answer
How to review agent traces by inspecting context, plans, tools, guardrails, critic verdicts, and corrections.