Software Delivery Squad

Human-supervised software agents that plan, implement, test, review, and open pull requests with traceable decision records.

Use Case Playbook
Last reviewed: 2026-05-09

At a glance

Purpose

Turn a well-scoped engineering task into a governed delivery run. ContextOS gives software agents the same boundaries a senior engineer expects: understand the ticket, inspect the repository, propose a plan, edit only owned files, run validation, request review, and preserve a replayable trail.

Why this is agentic-first

Software delivery is an ideal agentic workflow because it is multi-step, tool-heavy, and full of feedback loops. The agent has to read requirements, search the codebase, form a plan, edit files, run tests, interpret failures, revise, and hand work to humans at the right checkpoint.

Atlassian’s HULA research describes a human-in-the-loop coding agent that reads a Jira work item, creates a plan, writes code, and raises a pull request. IBM also frames enterprise IT delivery as an agentic workflow spanning code generation, testing, and deployment.

Context Pack

Layer	Required entries
`decision_layer.decision_specs[]`	`delivery.plan.approve`, `delivery.patch.prepare`, `delivery.pr.open`, `delivery.deploy.request`
`policy_layer.policy_bundles[]`	Repository ownership, change-risk tiers, secure coding policy, release policy
`policy_layer.approval_gates[]`	`GATE_PLAN_REVIEW`, `GATE_PR_REVIEW`, `GATE_DEPLOY_APPROVAL`
`tooling_layer.adapter_registry[]`	`adp_repo.search`, `adp_repo.patch`, `adp_ci.run`, `adp_pr.open`, `adp_issue.update`, `adp_release.request`
`memory_layer.write_classes_allowed`	`codebase_pattern`, `review_correction`, `decision_outcome`
`evaluation_layer.eval_targets[]`	test pass rate, lint pass rate, review rework rate, rollback rate
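
As a rough sketch, the table above could be written as a plain Python mapping with a small completeness check. The dictionary shape, the `REQUIRED_PATHS` list, and the `missing_entries` helper are assumptions made for illustration; this page does not specify how ContextOS serializes a Context Pack.

```python
# Hypothetical in-memory form of the Context Pack table.
# The real ContextOS schema may differ.
CONTEXT_PACK = {
    "decision_layer": {
        "decision_specs": [
            "delivery.plan.approve",
            "delivery.patch.prepare",
            "delivery.pr.open",
            "delivery.deploy.request",
        ],
    },
    "policy_layer": {
        "policy_bundles": [
            "Repository ownership",
            "change-risk tiers",
            "secure coding policy",
            "release policy",
        ],
        "approval_gates": [
            "GATE_PLAN_REVIEW",
            "GATE_PR_REVIEW",
            "GATE_DEPLOY_APPROVAL",
        ],
    },
    "tooling_layer": {
        "adapter_registry": [
            "adp_repo.search",
            "adp_repo.patch",
            "adp_ci.run",
            "adp_pr.open",
            "adp_issue.update",
            "adp_release.request",
        ],
    },
    "memory_layer": {
        "write_classes_allowed": [
            "codebase_pattern",
            "review_correction",
            "decision_outcome",
        ],
    },
    "evaluation_layer": {
        "eval_targets": [
            "test pass rate",
            "lint pass rate",
            "review rework rate",
            "rollback rate",
        ],
    },
}

# One (layer, key) pair per row of the table.
REQUIRED_PATHS = [
    ("decision_layer", "decision_specs"),
    ("policy_layer", "policy_bundles"),
    ("policy_layer", "approval_gates"),
    ("tooling_layer", "adapter_registry"),
    ("memory_layer", "write_classes_allowed"),
    ("evaluation_layer", "eval_targets"),
]


def missing_entries(pack):
    """Return dotted paths of required entries that are absent or empty."""
    missing = []
    for layer, key in REQUIRED_PATHS:
        if not pack.get(layer, {}).get(key):
            missing.append(f"{layer}.{key}")
    return missing
```

Calling `missing_entries(CONTEXT_PACK)` returns an empty list for the pack above, so a run could refuse to start whenever the list is non-empty.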

Agent roles

Agent	Responsibility	Boundary
Requirements Agent	Resolves ticket scope, acceptance criteria, and non-goals.	Read-only issue and docs access.
Codebase Agent	Finds relevant files, owners, APIs, and local conventions.	Read-only repo access.
Implementation Agent	Writes the patch inside declared ownership.	Cannot change protected files without gate.
Validation Agent	Runs tests, type checks, lint, and static checks.	Cannot merge or deploy.
Review Agent	Checks risk, security, docs, and acceptance criteria.	Can block PR creation.
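
One way to picture the boundaries in the table above is a deny-by-default capability map. The sketch below is illustrative only: the role keys, the capability strings such as `repo.write` and `pr.block`, and the helpers `can` and `may_write` are hypothetical names, not ContextOS identifiers.

```python
# Hypothetical capability sets derived from the "Boundary" column.
ROLE_CAPABILITIES = {
    "requirements": frozenset({"issue.read", "docs.read"}),
    "codebase": frozenset({"repo.read"}),
    "implementation": frozenset({"repo.read", "repo.write"}),
    "validation": frozenset({"repo.read", "ci.run"}),
    "review": frozenset({"repo.read", "pr.block"}),
}


def can(role, capability):
    """Deny by default: unknown roles and unlisted capabilities return False."""
    return capability in ROLE_CAPABILITIES.get(role, frozenset())


def may_write(role, path, protected_prefixes, gate_approved=False):
    """Allow a write only for roles with repo.write, and only outside
    protected paths unless an approval gate has already passed."""
    if not can(role, "repo.write"):
        return False
    is_protected = any(path.startswith(prefix) for prefix in protected_prefixes)
    return gate_approved or not is_protected
```

Because merge and deploy capabilities appear in no role, `can("validation", "pr.merge")` is false without any explicit deny rule.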

Execution flow

`invokeAgent` arrives with `intent=delivery.implement`, issue ID, repository, branch policy, and target owners.
Compiler pins repository state, issue text, coding standards, ownership map, and allowed tools.
Requirements Agent extracts acceptance criteria and detects ambiguity.
Codebase Agent builds a focused file map and dependency map.
Implementation Agent proposes a plan; `GATE_PLAN_REVIEW` triggers for broad or risky changes.
Implementation Agent edits only the approved write set.
Validation Agent runs local and remote checks, summarizes failures, and loops back to implementation when safe.
Review Agent verifies acceptance criteria, security posture, and touched ownership.
`adp_pr.open` runs only after the policy check passes and the required evidence is present.
ContextOS emits a `DecisionRecord` with `decision_key="delivery.pr.open"`, diff summary, tests, approvals, and trace.
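
The steps above can be sketched as an ordered pipeline that halts at the first blocking step and emits a record only when every step passes. This is a minimal illustration under stated assumptions: the `DecisionRecord` field names follow the list in the final step, while the `Step` callable shape, the `run_flow` function, and the shared `state` dictionary are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class DecisionRecord:
    decision_key: str
    diff_summary: str
    tests: tuple[str, ...]
    approvals: tuple[str, ...]
    trace: tuple[str, ...]


# A step reads and updates the shared run state and returns True to
# continue or False to halt (for example, an ambiguous ticket).
Step = Callable[[dict], bool]


def run_flow(steps: list[tuple[str, Step]], state: dict) -> DecisionRecord | None:
    """Run steps in order; return a DecisionRecord only if none halts."""
    trace = []
    for name, step in steps:
        ok = step(state)
        trace.append(f"{name}:{'ok' if ok else 'halted'}")
        if not ok:
            # Keep the partial trace so the halted run stays inspectable.
            state["trace"] = trace
            return None
    return DecisionRecord(
        decision_key="delivery.pr.open",
        diff_summary=state.get("diff_summary", ""),
        tests=tuple(state.get("tests", ())),
        approvals=tuple(state.get("approvals", ())),
        trace=tuple(trace),
    )
```

The record is frozen so that the trail cannot be edited after the run, which is one simple way to keep it replayable.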

Decision gates

Gate	Trigger	Required evidence
`GATE_PLAN_REVIEW`	Cross-module work, schema migrations, auth changes, payment logic, or security-sensitive code.	issue scope, file map, owners, rollback plan
`GATE_PR_REVIEW`	PR creation against protected repository.	passing checks or explicit waiver, diff summary, reviewer list
`GATE_DEPLOY_APPROVAL`	Any production deploy request.	release notes, CI evidence, rollback command, owner approval
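
A gate's evidence check can be sketched as a lookup plus a set intersection. In the sketch below, each requirement is a set of acceptable alternatives so that "passing checks or explicit waiver" is satisfied by either item. The snake_case evidence keys and the `missing_evidence` helper are assumptions for illustration, not names defined by ContextOS.

```python
# Each gate maps to a list of requirements; a requirement is a set of
# alternatives, any one of which satisfies it. Keys are hypothetical.
GATE_EVIDENCE = {
    "GATE_PLAN_REVIEW": [
        {"issue_scope"},
        {"file_map"},
        {"owners"},
        {"rollback_plan"},
    ],
    "GATE_PR_REVIEW": [
        {"passing_checks", "explicit_waiver"},
        {"diff_summary"},
        {"reviewer_list"},
    ],
    "GATE_DEPLOY_APPROVAL": [
        {"release_notes"},
        {"ci_evidence"},
        {"rollback_command"},
        {"owner_approval"},
    ],
}


def missing_evidence(gate, provided):
    """Return a readable list of unmet requirements for the given gate."""
    unmet = []
    for alternatives in GATE_EVIDENCE[gate]:
        if not alternatives.intersection(provided):
            unmet.append(" or ".join(sorted(alternatives)))
    return unmet
```

A gate would pass only when `missing_evidence` returns an empty list; an unknown gate name raises `KeyError` rather than passing silently.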

Failure modes

Ambiguous ticket: Requirements Agent stops the run and asks for clarification instead of guessing.
Ownership drift: Compiler blocks edits outside the approved write set.
Validation loop without progress: Orchestrator caps retries and emits a diagnosis artifact.
Unsafe generated code: Review Agent blocks PR creation when security or data-handling rules fail.
Stale branch: Tool Gateway refuses PR creation until the branch is rebased or revalidated.
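
Two of the failure modes above lend themselves to a short sketch: the write-set check behind ownership drift and the retry cap behind a stalled validation loop. Everything named here is hypothetical, including the glob-pattern form of the write set, the default cap of three attempts, and the shape of the diagnosis dictionary.

```python
from fnmatch import fnmatchcase


def outside_write_set(touched_paths, approved_patterns):
    """Return touched paths that match no approved glob pattern."""
    return [
        path
        for path in touched_paths
        if not any(fnmatchcase(path, pattern) for pattern in approved_patterns)
    ]


def validation_loop(run_checks, apply_fix, max_attempts=3):
    """Re-run checks until they pass, the cap is hit, or progress stalls.

    run_checks() returns a list of failing check names (empty means green);
    apply_fix(failures) asks the implementation step to revise the patch.
    """
    history = []
    for attempt in range(1, max_attempts + 1):
        failures = sorted(run_checks())
        if not failures:
            return {"status": "passed", "attempts": attempt}
        history.append(failures)
        # Stalled: the same failures came back after a fix attempt.
        stalled = len(history) >= 2 and history[-1] == history[-2]
        if stalled or attempt == max_attempts:
            break
        apply_fix(failures)
    # The history doubles as a simple diagnosis artifact.
    return {"status": "diagnosis", "attempts": len(history), "history": history}
```

Stopping on a repeated failure set, not only on the cap, avoids spending the remaining attempts on a fix that is clearly not converging.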

Metrics

Plan approval rate.
First-pass test and typecheck pass rate.
Review rework rate and reviewer override rate.
Time from issue intake to PR.
Production rollback and hotfix rate for agent-authored changes.
Number of learned codebase patterns promoted to memory.
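
Several of these metrics reduce to simple ratios over per-run records. The sketch below computes three of them; the record fields (`plan_approved`, `first_pass_green`, `intake_at`, `pr_opened_at`) and the `delivery_metrics` function are assumed for illustration and do not reflect a documented ContextOS export format.

```python
from datetime import datetime
from statistics import median


def delivery_metrics(runs):
    """Summarize a list of run records (hypothetical dict shape)."""
    if not runs:
        return {}
    # Only runs that actually opened a PR contribute to time-to-PR.
    hours_to_pr = [
        (r["pr_opened_at"] - r["intake_at"]).total_seconds() / 3600
        for r in runs
        if r.get("pr_opened_at") is not None
    ]
    return {
        "plan_approval_rate": sum(r["plan_approved"] for r in runs) / len(runs),
        "first_pass_rate": sum(r["first_pass_green"] for r in runs) / len(runs),
        "median_hours_to_pr": median(hours_to_pr) if hours_to_pr else None,
    }
```

The median is used for time to PR so that a single long-running ticket does not dominate the figure.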

Research signals

Atlassian’s HULA research describes plan generation, code generation, validation, and PR creation with human review.
IBM Enterprise Advantage frames autonomous development workflows as a path to faster IT delivery.
McKinsey’s agentic AI security guidance calls for ownership, access control, and traceability across agentic systems.