AGENTS.md Done Right: The Navigation File That Actually Helps Coding Agents

The first AGENTS.md I wrote was 612 lines.

It had three sections on coding style, two on deployment rules, a copied incident-response runbook, a tour of our services, a paragraph about why we use TypeScript, and a long list of things the agent should “prefer.” It felt responsible. It was mostly noise.

The agent read the first page, absorbed the vibe, and then behaved like a generic coding assistant with a vague sense that the company had opinions.

I rewrote the file to 41 lines. The same agent started finding the right runbook, running the right verification command, and refusing changes that crossed policy boundaries. Nothing about the model changed. The harness around the model became navigable.

That is the job of AGENTS.md: not to explain the whole repo, not to replace docs, and not to become a second system prompt. It is the navigation file that tells coding agents where truth lives, what commands prove work, and which actions require a hard stop.

The official AGENTS.md format frames it as a README for agents. OpenAI Codex documents how it discovers global, project, and nested AGENTS.md guidance, with closer files overriding broader ones. The Meta-Harness result points in the same direction from research: better harnesses come from giving agents access to source, scores, traces, and concise skills, not from compressing everything into a monologue.

This post is the field version: what belongs in AGENTS.md, what should be moved out, how to structure it for nested repos, and the template I would ship today.

Field guide

AGENTS.md should route, constrain, and verify.

A useful file gives the agent a map, a few hard boundaries, and the exact commands that prove a change. Everything else should live in the file it points to.

Job	What to do	In practice
Route	Point to source files	Architecture, security, reliability, and harness contracts stay in their owned docs.
Constrain	Name forbidden actions	Pushing to main, bypassing gates, editing policy, and touching secrets need explicit boundaries.
Verify	Give runnable checks	The best style rule is a command the agent can run and fix until it passes.

What changed

AGENTS.md is now adopted widely enough that the old “team-specific prompt file” framing undersells it. Three facts matter operationally.

Fact	What it changes
It is a shared, Markdown-based convention used across many coding agents	Keep the file plain, portable, and readable by tools beyond one vendor
Codex builds an instruction chain from global, project, and nested files	Use root guidance for repo-wide rules and nested files for subsystem rules
The closer file wins when instructions conflict	Put payment, security, data, or mobile-specific rules near the code they govern

The practical consequence: stop trying to make one root file cover every situation. A root AGENTS.md should set defaults and point to durable sources. A nested services/payments/AGENTS.md should override the checks and boundaries for payments. A nested packages/mobile/AGENTS.md should name mobile-specific test commands. The root should not pretend to know all of that forever.
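To make the root-to-nested chain concrete, here is a minimal sketch of how a tool could collect instruction files from the repo root down to the working directory. It is not Codex's actual implementation; the function name `instruction_chain` and its parameters are hypothetical, and it only illustrates the "broadest first, closest last" ordering described above.

```python
from pathlib import Path


def instruction_chain(repo_root: Path, working_dir: Path, name: str = "AGENTS.md") -> list[Path]:
    """Return instruction files from repo_root down to working_dir, broadest first.

    Hypothetical helper: a consumer that resolves conflicts would let later
    (closer) entries override earlier (broader) ones.
    """
    root = repo_root.resolve()
    target = working_dir.resolve()
    # relative_to raises ValueError when working_dir is outside the repo.
    relative = target.relative_to(root)

    # Build the list of directories on the path from the root to the target.
    directories = [root]
    for part in relative.parts:
        directories.append(directories[-1] / part)

    # Keep only the directories that actually contain an instruction file.
    return [d / name for d in directories if (d / name).is_file()]
```

For a task running in services/payments/src, the chain would hold the root file first and services/payments/AGENTS.md last, so the payments rules are the ones that win on conflict.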

The anti-pattern: the prompt dump

Here is what AGENTS.md tends to become if nobody owns it. This is representative, lightly redacted:

# AGENTS.md
 
You are a senior engineer at AcmeCo. We value high-quality code, good
test coverage, and clear documentation. Our codebase is mostly
TypeScript with some Python services for data processing.
 
## Coding Style
- Use 2-space indentation
- Prefer functional components in React
- Always use named exports
- Tests go in __tests__/ directories alongside the source
- We use Jest for unit tests and Playwright for e2e
- ... (47 more lines on style)
 
## Architecture
We have a microservices architecture with the following services:
- api-gateway (handles auth, routing)
- user-service (user CRUD, profiles)
- billing-service (Stripe integration)
- ... (12 more services)
 
## Deployment
We deploy via GitHub Actions on merge to main. Production deploys
require approval from a team lead. ... (60 more lines)
 
## Things you should never do
- Don't push to main directly
- Don't disable tests
- Don't add new dependencies without approval
- Don't refactor unrelated code
- ... (28 more bullets)
 
## Helpful context
We are a B2B SaaS company. Our customers are mostly mid-market...
(continues for 200 more lines on company history)

This file has three problems.

It treats AGENTS.md as documentation. The agent has to read it linearly. There is no natural first action, no routing rule, and no clear distinction between a preference and a hard boundary.

It encodes facts that age. “We use Jest” is true until Tuesday. “Our services are X, Y, Z” is true until the next reorg. Stale instructions are worse than missing instructions because the agent follows them with confidence.

It buries the dangerous rules. “Prefer named exports” and “do not push to main” should not be peers. One is style. The other is a control.

The result is a decorative file: long enough to feel serious, too unfocused to change behavior.

The pattern: the routing file

The useful version is shorter and more forceful:

# AGENTS.md
 
You are operating inside this engineering repository.
 
## Start here
 
1. Read ARCHITECTURE.md for system boundaries and ownership.
2. Read SECURITY.md before touching PII, auth, payment, secrets, or access control.
3. Read RELIABILITY.md before changing runtime flows, retries, queues, or fallbacks.
4. Check docs/execution-plans/active/ for in-flight work.
5. Run `make verify` before opening a PR or declaring work complete.
 
## Forbidden without explicit approval
 
- Do not bypass policy gates in harness/policies/.
- Do not push directly to main.
- Do not disable tests, hooks, signing checks, or reviewer gates.
- Do not add production dependencies without approval.
- Do not edit harness/policies/ or harness/tools/ without an ADR.
- Do not read, print, or copy secrets from .env, keychains, vault dumps, or CI logs.
 
## Required for every change
 
- Tests cover the changed path and the rollback path when behavior changes.
- Telemetry covers the changed path with the required span attributes.
- Rollback notes are included in the PR description for runtime changes.
- If verification cannot run, explain exactly why and what residual risk remains.
 
## Harness layout
 
- harness/packs/        Context Pack bundles, versioned and replayed
- harness/policies/     policy bundles; JsonLogic or DSL; reviewed by governance
- harness/tools/        tool manifests with schema, owner, and approval mode
- harness/evals/        golden sets and scenario rubrics
- harness/reviewers/    reviewer-agent skills, one per concern
- harness/feedback/     captured corrections and improvement-loop inputs
 
## When to stop
 
If a change crosses a policy, payment, privacy, destructive-action, or production-data
boundary, stop and ask for approval. Do not silently proceed.

This is not shorter because short is virtuous. It is shorter because each line has a job.

Section	Job
Start here	Tells the agent where durable truth lives
Forbidden without approval	Converts risk into hard stop conditions
Required for every change	Defines completion, not style
Harness layout	Makes packs, policies, tools, evals, and feedback discoverable
When to stop	Prevents ambiguity from becoming action
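Hard boundaries are easier to trust when something outside the agent also checks them. The sketch below maps changed paths to the boundary they cross, assuming the protected locations named in the template (harness/policies/, harness/tools/, .env). The `PROTECTED` map and the `approvals_needed` function are hypothetical names for illustration, not part of any existing tool.

```python
from pathlib import PurePosixPath

# Hypothetical boundary map, mirroring the "Forbidden without explicit approval" list.
PROTECTED = {
    PurePosixPath("harness/policies"): "requires a linked ADR",
    PurePosixPath("harness/tools"): "requires a linked ADR",
    PurePosixPath(".env"): "secrets: do not read, print, or copy",
}


def approvals_needed(changed_paths: list[str]) -> dict[str, str]:
    """Map each changed path that crosses a boundary to the reason it needs approval."""
    flagged = {}
    for raw in changed_paths:
        path = PurePosixPath(raw)
        for boundary, reason in PROTECTED.items():
            # Match the boundary itself or anything nested underneath it.
            if path == boundary or boundary in path.parents:
                flagged[raw] = reason
                break
    return flagged
```

A non-empty result is the signal to stop and ask. Run against the file list of a diff, the same check could gate a PR in CI.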

What belongs in AGENTS.md

A useful AGENTS.md has five kinds of content.

Belongs	Example	Why
Entry points	`Read SECURITY.md before touching auth`	Routes the agent to owned docs
Verification commands	`make verify`, `pnpm test --filter api`	Lets the agent prove work mechanically
Hard boundaries	`Do not edit policy bundles without ADR`	Turns risk into a visible stop condition
Repo map	`harness/policies/`, `docs/execution-plans/active/`	Reduces filesystem wandering
Escalation rules	`Stop on payment, secrets, destructive actions`	Makes uncertainty explicit

Everything else has to earn its place.

What to move out

The fastest way to improve an AGENTS.md is deletion.

Move this out	Put it here instead
Long architecture descriptions	`ARCHITECTURE.md`
Complete security policy	`SECURITY.md`
Service catalog	`docs/services/` or generated service index
Style preferences enforced by tooling	Formatter, linter, type checker
Historical context	ADRs or decision records
One-off task instructions	`docs/execution-plans/active/<task>.md`
Reviewer rubrics	`harness/reviewers/<concern>.md`
Prompt fragments or skills	`harness/skills/` or a scoped `SKILL.md`

The rule is simple: if the agent can discover it by following a pointer or running a command, do not paste it into the root instruction file.

Use nested files deliberately

Nested AGENTS.md files are the clean way to avoid a bloated root. Codex discovers guidance from the project root down to the working directory and gives closer files precedence. The open AGENTS.md guidance makes the same basic recommendation for large repos: place tailored files inside subprojects.

That means this structure is better than one huge root file:

repo/
  AGENTS.md
  ARCHITECTURE.md
  SECURITY.md
  RELIABILITY.md
 
  services/
    payments/
      AGENTS.md
      src/
    search/
      AGENTS.md
      src/
 
  packages/
    web/
      AGENTS.md
    mobile/
      AGENTS.md
 
  harness/
    policies/
      AGENTS.md
    tools/
      AGENTS.md
    evals/
      AGENTS.md

The root file should not know the exact command for every package forever. It should say how to find package-specific instructions and which repo-wide boundaries never change.

The harness/policies/AGENTS.md can be severe:

# harness/policies/AGENTS.md
 
Policy files are production controls.
 
- Do not edit policy bundles without a linked ADR.
- Run `make policy-test` after every change.
- Add or update at least one golden case for every new rule.
- Never weaken a deny, approval, or redaction rule without governance approval.
- Preserve rule ids; add replacements instead of renaming historical rules.

That belongs near policy code, not in the root file. The closer the instruction is to the governed surface, the less likely it is to be ignored or stale.

The harness/ directory it points into

The files AGENTS.md references should actually exist. The ContextOS-aligned layout from Harness Engineering is:

repo/
  AGENTS.md                       # primary navigation for agents
  ARCHITECTURE.md
  SECURITY.md
  RELIABILITY.md
 
  docs/
    decision-records/             # ADRs, dated, append-only
    execution-plans/
      active/
      completed/
    runbooks/
    known-limitations/
 
  harness/
    packs/                        # Context Pack bundles, versioned
    policies/                     # JsonLogic / DSL bundles
    tools/                        # Tool manifests with schema + ownership
    evals/                        # golden sets, scenario rubrics
    fixtures/                     # synthetic scenarios for simulation
    validators/                   # interface tests run before full eval
    observability/                # trace schemas, span attribute conventions
    reviewers/                    # review-agent skills + rubrics
    skills/                       # planner / executor / critic skills
    feedback/                     # captured corrections + lineage

Two principles govern the layout.

Anything the agent must follow is visible, versioned, and machine-checkable. If a policy matters, it lives in the repo or in a referenced registry with a stable id. If it exists only in a wiki, it is background reading, not a control.

An agent can find prior experience by reading the filesystem. Meta-Harness works because its proposer sees source code, scores, traces, and prior candidates through ordinary files. The same lesson applies to production repos: store decisions, failures, reviewer verdicts, and corrections where agents can inspect them.

Keep the file small on purpose

Small is not an aesthetic preference. It is an operational constraint.

Codex documents a cap on the combined size of discovered project guidance (the `project_doc_max_bytes` setting, 32 KiB by default at the time of writing), and other tools have their own context budgets. Even when the limit is generous, loading 600 lines of mixed advice into every task is wasteful. Worse, the important rule becomes one bullet among many.

Use this budget test:

If the line says…	Keep it?
“Run this exact command before finishing”	Yes
“Never do this without approval”	Yes
“Read this owned file before touching this area”	Yes
“We usually prefer…”	Usually no
“Here is the whole architecture”	No
“Here is a one-time instruction for this migration”	No
“Here is company background”	No

The best AGENTS.md files are not comprehensive. They are selective.
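The budget test can be roughed out as a small lint. This is a sketch under stated assumptions: the 60-line budget, the list of soft phrases, and the function name `budget_findings` are all hypothetical and should be tuned per repo. It flags files that exceed the line budget and lines that read as preferences rather than commands or boundaries.

```python
import re

# Hypothetical threshold and phrase list; tune both for your own repo.
MAX_LINES = 60
SOFT_PHRASES = re.compile(r"\b(prefer|usually|ideally|try to|where possible)\b", re.IGNORECASE)


def budget_findings(text: str, max_lines: int = MAX_LINES) -> list[str]:
    """Return human-readable findings for an AGENTS.md body."""
    findings = []
    lines = text.splitlines()

    # Only non-blank lines count against the budget.
    non_blank = [line for line in lines if line.strip()]
    if len(non_blank) > max_lines:
        findings.append(f"{len(non_blank)} non-blank lines exceeds budget of {max_lines}")

    # Soft wording usually marks a preference that belongs in tooling or docs.
    for number, line in enumerate(lines, start=1):
        if SOFT_PHRASES.search(line):
            findings.append(f"line {number}: soft preference, consider moving out: {line.strip()}")
    return findings
```

A phrase match is a prompt for review, not an automatic deletion. Some soft wording is legitimate, but each hit should be able to justify its place.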

Versioning AGENTS.md

AGENTS.md is itself a harness artifact. Treat it like one.

Trigger	What to do
Quarterly review	Remove stale guidance, check commands, prune unused sections
Repeated agent mistake	Add a constraint, command, or pointer only if it prevents a pattern
Reviewer-agent recurring finding	Promote the underlying rule into AGENTS.md or a nested scoped file
New subsystem	Add a nested file near the subsystem instead of bloating the root
Major architecture change	Update the pointer target first, then AGENTS.md if the route changes

The diff history is diagnostic. If AGENTS.md changes every week, it is probably being used as a scratchpad. If it never changes, it is probably stale. A healthy file changes when the harness learns a durable rule.
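One way to read the diff history mechanically is to classify how often the file changed in a recent window. The sketch below assumes you have already collected commit dates for the file, for example from `git log --format=%cd --date=short -- AGENTS.md`. The 90-day window, the threshold of 12 commits (roughly weekly), and the name `classify_churn` are hypothetical choices, not an established rule.

```python
from datetime import date, timedelta


def classify_churn(commit_dates: list[date], today: date,
                   window_days: int = 90, scratchpad_threshold: int = 12) -> str:
    """Classify AGENTS.md churn as 'scratchpad', 'stale', or 'healthy'.

    Thresholds are illustrative assumptions: about weekly edits across the
    window suggests scratchpad use, and no edits at all suggests staleness.
    """
    cutoff = today - timedelta(days=window_days)
    recent = [d for d in commit_dates if cutoff <= d <= today]
    if len(recent) >= scratchpad_threshold:
        return "scratchpad"
    if not recent:
        return "stale"
    return "healthy"
```

The labels are a starting point for the quarterly review, not a verdict. A burst of edits during a real migration can be healthy.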

A review checklist

Before you merge an AGENTS.md change, check it like production code:

Does every command still run?
Is every linked file present?
Are forbidden actions phrased as hard boundaries, not preferences?
Could a nested file own this rule more precisely?
Is any line duplicating formatter, linter, CI, or policy-engine behavior?
Is any rule stale, team-specific, or historical?
Can an agent decide when to stop and ask?
Would a new engineer understand the repo better after reading it?

If any answer points to a problem, keep editing.
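The first two checklist items can be automated. The following is a minimal sketch, assuming paths in AGENTS.md are written relative to the repo root and that verification commands appear as backticked `make <target>` calls backed by a top-level Makefile. The regexes and the function name `check_references` are hypothetical, and the directory pattern is deliberately crude: it will also match path-like fragments of URLs.

```python
import re
from pathlib import Path

# Illustrative patterns: Markdown file references, directories with a
# trailing slash, and backticked make invocations.
MD_FILE = re.compile(r"\b[\w-]+(?:/[\w-]+)*\.md\b")
DIRECTORY = re.compile(r"\b(?:[\w-]+/)+(?!\w)")
MAKE_TARGET = re.compile(r"`make ([\w-]+)`")


def check_references(repo_root: Path, agents_file: str = "AGENTS.md") -> list[str]:
    """Report paths and make targets that AGENTS.md mentions but the repo lacks."""
    text = (repo_root / agents_file).read_text(encoding="utf-8")
    problems = []

    # Every referenced file or directory should exist on disk.
    paths = set(MD_FILE.findall(text)) | set(DIRECTORY.findall(text))
    for ref in sorted(paths):
        if not (repo_root / ref).exists():
            problems.append(f"missing path: {ref}")

    # Every referenced make target should have a rule in the Makefile.
    makefile = repo_root / "Makefile"
    rules = makefile.read_text(encoding="utf-8") if makefile.is_file() else ""
    for target in sorted(set(MAKE_TARGET.findall(text))):
        if not re.search(rf"^{re.escape(target)}\s*:", rules, re.MULTILINE):
            problems.append(f"missing make target: {target}")
    return problems
```

This only proves that a target is declared, not that it passes. Running the commands themselves still belongs in CI.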

The thing this changes

When AGENTS.md is a navigation file, the agent treats the repo as a real environment instead of a story.

It reads SECURITY.md before touching auth because the file told it to. It opens harness/policies/ instead of guessing policy from a prompt. It runs `make verify` because completion is tied to a command. It stops on destructive or regulated changes because the boundary is named plainly.

That is the small but durable shift. The model does not get smarter. The environment gets easier to operate.

Forty lines that route to truth will beat 600 lines of advice almost every time.