Software Delivery Squad
Human-supervised software agents that plan, implement, test, review, and open pull requests with traceable decision records.
Purpose
Turn a well-scoped engineering task into a governed delivery run. ContextOS gives software agents the same boundaries a senior engineer expects: understand the ticket, inspect the repository, propose a plan, edit only owned files, run validation, request review, and preserve a replayable trail.
Why this is agentic-first
Software delivery is an ideal agentic workflow because it is multi-step, tool-heavy, and full of feedback loops. The agent has to read requirements, search the codebase, form a plan, edit files, run tests, interpret failures, revise, and hand work to humans at the right checkpoint.
Atlassian’s HULA research describes a human-in-the-loop coding agent that reads a Jira work item, creates a plan, writes code, and raises a pull request. IBM also frames enterprise IT delivery as an agentic workflow spanning code generation, testing, and deployment.
Context Pack
| Layer | Required entries |
|---|---|
| `decision_layer.decision_specs[]` | `delivery.plan.approve`, `delivery.patch.prepare`, `delivery.pr.open`, `delivery.deploy.request` |
| `policy_layer.policy_bundles[]` | Repository ownership, change-risk tiers, secure coding policy, release policy |
| `policy_layer.approval_gates[]` | `GATE_PLAN_REVIEW`, `GATE_PR_REVIEW`, `GATE_DEPLOY_APPROVAL` |
| `tooling_layer.adapter_registry[]` | `adp_repo.search`, `adp_repo.patch`, `adp_ci.run`, `adp_pr.open`, `adp_issue.update`, `adp_release.request` |
| `memory_layer.write_classes_allowed` | `codebase_pattern`, `review_correction`, `decision_outcome` |
| `evaluation_layer.eval_targets[]` | test pass rate, lint pass rate, review rework rate, rollback rate |
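
The Context Pack table can be read as a nested configuration. The following is a minimal sketch in Python, assuming a plain dictionary layout; the `CONTEXT_PACK` structure, the `REQUIRED_ENTRIES` map, and the `missing_entries` helper are hypothetical illustrations and not a documented ContextOS format.

```python
# Hypothetical in-memory form of the Context Pack. Layer and entry names
# come from the table; the dict layout itself is an assumption.
CONTEXT_PACK = {
    "decision_layer": {
        "decision_specs": [
            "delivery.plan.approve",
            "delivery.patch.prepare",
            "delivery.pr.open",
            "delivery.deploy.request",
        ],
    },
    "policy_layer": {
        "policy_bundles": [
            "Repository ownership",
            "change-risk tiers",
            "secure coding policy",
            "release policy",
        ],
        "approval_gates": [
            "GATE_PLAN_REVIEW",
            "GATE_PR_REVIEW",
            "GATE_DEPLOY_APPROVAL",
        ],
    },
    "tooling_layer": {
        "adapter_registry": [
            "adp_repo.search",
            "adp_repo.patch",
            "adp_ci.run",
            "adp_pr.open",
            "adp_issue.update",
            "adp_release.request",
        ],
    },
    "memory_layer": {
        "write_classes_allowed": [
            "codebase_pattern",
            "review_correction",
            "decision_outcome",
        ],
    },
    "evaluation_layer": {
        "eval_targets": [
            "test pass rate",
            "lint pass rate",
            "review rework rate",
            "rollback rate",
        ],
    },
}

# Entries each layer must provide, mirroring the "Layer" column of the table.
REQUIRED_ENTRIES = {
    "decision_layer": ["decision_specs"],
    "policy_layer": ["policy_bundles", "approval_gates"],
    "tooling_layer": ["adapter_registry"],
    "memory_layer": ["write_classes_allowed"],
    "evaluation_layer": ["eval_targets"],
}


def missing_entries(pack):
    """Return dotted paths of required entries that are absent or empty."""
    missing = []
    for layer, keys in REQUIRED_ENTRIES.items():
        for key in keys:
            if not pack.get(layer, {}).get(key):
                missing.append(f"{layer}.{key}")
    return missing
```

Calling `missing_entries(CONTEXT_PACK)` on the pack above returns an empty list; a pack with an empty or absent entry yields its dotted path, which a compiler step could use to refuse the run early.
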
Agent roles
| Agent | Responsibility | Boundary |
|---|---|---|
| Requirements Agent | Resolves ticket scope, acceptance criteria, and non-goals. | Read-only issue and docs access. |
| Codebase Agent | Finds relevant files, owners, APIs, and local conventions. | Read-only repo access. |
| Implementation Agent | Writes the patch inside declared ownership. | Cannot change protected files without gate. |
| Validation Agent | Runs tests, type checks, lint, and static checks. | Cannot merge or deploy. |
| Review Agent | Checks risk, security, docs, and acceptance criteria. | Can block PR creation. |
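
The role boundaries above amount to a small permission model. As a rough sketch, assuming hypothetical action names such as `repo.read` and `repo.write` (they are not ContextOS identifiers), a role check with a protected-file gate might look like this:

```python
# Hypothetical action sets per role, derived from the "Boundary" column.
# The action names are illustrative assumptions.
ROLE_ACTIONS = {
    "requirements": {"issue.read", "docs.read"},
    "codebase": {"repo.read"},
    "implementation": {"repo.read", "repo.write"},
    "validation": {"repo.read", "checks.run"},
    "review": {"repo.read", "pr.block"},
}


def authorize(role, action, path=None, protected=frozenset(), gate_approved=False):
    """Return True if the role may perform the action on the given path."""
    # Unknown roles and actions outside the role's set are always denied.
    if action not in ROLE_ACTIONS.get(role, set()):
        return False
    # Writes to protected files additionally require an approved gate.
    if action == "repo.write" and path in protected:
        return gate_approved
    return True
```

For example, `authorize("validation", "pr.merge")` is denied because merging is not in the Validation Agent's action set, and an Implementation Agent write to a protected path succeeds only when `gate_approved` is true.
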
Execution flow
- `invokeAgent` arrives with `intent=delivery.implement`, issue ID, repository, branch policy, and target owners.
- Compiler pins repository state, issue text, coding standards, ownership map, and allowed tools.
- Requirements Agent extracts acceptance criteria and detects ambiguity.
- Codebase Agent builds a focused file map and dependency map.
- Implementation Agent proposes a plan; `GATE_PLAN_REVIEW` triggers for broad or risky changes.
- Implementation Agent edits only the approved write set.
- Validation Agent runs local and remote checks, summarizes failures, and loops back to implementation when safe.
- Review Agent verifies acceptance criteria, security posture, and touched ownership.
- `adp_pr.open` runs only after policy acceptance and required evidence are present.
- ContextOS emits a `DecisionRecord` with `decision_key="delivery.pr.open"`, diff summary, tests, approvals, and trace.
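
The flow above can be sketched as a single function that stops at the first unmet condition. This is an illustrative outline only: the `RunResult` type, the status strings, the trace messages, and the `run_delivery` signature are assumptions, and the `DecisionRecord` fields simply follow the items listed in the last step.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class DecisionRecord:
    decision_key: str
    diff_summary: str
    tests: List[str]
    approvals: List[str]
    trace: List[str]


@dataclass
class RunResult:
    status: str  # "pr_opened" or a hypothetical blocking reason
    trace: List[str] = field(default_factory=list)
    record: Optional[DecisionRecord] = None


def run_delivery(intent: str, write_set: Set[str], edited_files: Set[str],
                 risky_plan: bool, plan_approved: bool,
                 check_results: Dict[str, bool]) -> RunResult:
    if intent != "delivery.implement":
        raise ValueError(f"unsupported intent: {intent}")
    trace = [
        "compiler: pinned repository state and allowed tools",
        "requirements: extracted acceptance criteria",
        "codebase: built file map",
        "implementation: proposed plan",
    ]
    approvals: List[str] = []
    # Broad or risky plans must clear GATE_PLAN_REVIEW before any edit.
    if risky_plan:
        if not plan_approved:
            trace.append("GATE_PLAN_REVIEW: awaiting approval")
            return RunResult("blocked_plan_review", trace)
        approvals.append("GATE_PLAN_REVIEW")
    # Edits outside the approved write set stop the run.
    outside = sorted(edited_files - write_set)
    if outside:
        trace.append(f"compiler: blocked edits outside write set: {outside}")
        return RunResult("blocked_ownership", trace)
    trace.append("implementation: edited approved write set")
    # Every check must be present and passing; no results counts as failure.
    if not check_results or not all(check_results.values()):
        trace.append("validation: checks missing or failing")
        return RunResult("blocked_validation", trace)
    trace.append("validation: checks passed")
    trace.append("review: acceptance criteria verified")
    trace.append("adp_pr.open: pull request opened")
    record = DecisionRecord(
        decision_key="delivery.pr.open",
        diff_summary=f"{len(edited_files)} file(s) changed",
        tests=sorted(check_results),
        approvals=approvals,
        trace=trace,
    )
    return RunResult("pr_opened", trace, record)
```

A real orchestrator would loop between validation and implementation rather than stopping on the first failing check; the sketch keeps a single pass so that the ordering of gates stays easy to follow.
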
Decision gates
| Gate | Trigger | Required evidence |
|---|---|---|
| `GATE_PLAN_REVIEW` | Cross-module work, schema migrations, auth changes, payment logic, or security-sensitive code. | issue scope, file map, owners, rollback plan |
| `GATE_PR_REVIEW` | PR creation against protected repository. | passing checks or explicit waiver, diff summary, reviewer list |
| `GATE_DEPLOY_APPROVAL` | Any production deploy request. | release notes, CI evidence, rollback command, owner approval |
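
One way to make the gate table executable is to treat each gate as a set of required evidence keys and report whatever is missing. The sketch below assumes hypothetical snake_case key names (for example `checks_or_waiver`) and change tags (for example `schema_migration`); only the gate identifiers come from the table.

```python
# Required evidence per gate. Key names are illustrative assumptions that
# paraphrase the "Required evidence" column.
GATE_EVIDENCE = {
    "GATE_PLAN_REVIEW": {"issue_scope", "file_map", "owners", "rollback_plan"},
    "GATE_PR_REVIEW": {"checks_or_waiver", "diff_summary", "reviewer_list"},
    "GATE_DEPLOY_APPROVAL": {
        "release_notes", "ci_evidence", "rollback_command", "owner_approval",
    },
}

# Hypothetical change tags that trigger GATE_PLAN_REVIEW.
PLAN_REVIEW_TRIGGERS = {
    "cross_module", "schema_migration", "auth", "payment", "security_sensitive",
}


def needs_plan_review(change_tags):
    """True if any tag on the proposed change matches a plan-review trigger."""
    return bool(set(change_tags) & PLAN_REVIEW_TRIGGERS)


def missing_evidence(gate, evidence):
    """Return the sorted evidence keys that are absent or empty for a gate."""
    required = GATE_EVIDENCE[gate]
    return sorted(key for key in required if not evidence.get(key))
```

A gate passes only when `missing_evidence` returns an empty list. Returning the missing keys, rather than a bare boolean, gives the approver a concrete list of what to request.
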
Failure modes
- Ambiguous ticket: Requirements Agent stops the run and asks for clarification instead of guessing.
- Ownership drift: Compiler blocks edits outside the approved write set.
- Validation loop without progress: Orchestrator caps retries and emits a diagnosis artifact.
- Unsafe generated code: Review Agent blocks PR creation when security or data-handling rules fail.
- Stale branch: Tool Gateway refuses PR creation until the branch is rebased or revalidated.
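
The retry cap for the validation loop can be illustrated with a short sketch. The `run_validation_loop` function, its callback signatures, and the shape of the returned dictionary are assumptions made for illustration; the document does not specify how the Orchestrator implements the cap or what the diagnosis artifact contains.

```python
def run_validation_loop(validate, fix, max_attempts=3):
    """Alternate validation and repair until checks pass or the cap is hit.

    validate() returns a set of failing check names (empty means success).
    fix(failures) attempts a repair and returns nothing.
    """
    history = []
    for attempt in range(1, max_attempts + 1):
        failures = validate()
        history.append(sorted(failures))
        if not failures:
            return {"status": "passed", "attempts": attempt, "history": history}
        # Skip the repair after the final attempt: nothing would re-validate it.
        if attempt < max_attempts:
            fix(failures)
    # Cap reached: return a diagnosis instead of looping indefinitely.
    return {
        "status": "capped",
        "attempts": max_attempts,
        "history": history,
        "diagnosis": f"still failing after {max_attempts} attempts: {history[-1]}",
    }
```

Keeping the per-attempt failure history in the result lets a reviewer see whether the loop was converging (fewer failures each round) or stuck on the same check.
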
Metrics
- Plan approval rate.
- First-pass test and typecheck pass rate.
- Review rework rate and reviewer override rate.
- Time from issue intake to PR.
- Production rollback and hotfix rate for agent-authored changes.
- Number of learned codebase patterns promoted to memory.
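
Several of these metrics are simple ratios over completed runs. The following sketch assumes a hypothetical `RunRecord` shape with one boolean per outcome; the field names and the choice of a median for time to PR are illustrative, not part of ContextOS.

```python
from dataclasses import dataclass
from statistics import median
from typing import Dict, List


@dataclass
class RunRecord:
    plan_approved: bool
    first_pass_checks_ok: bool   # tests and typecheck passed on first attempt
    review_rework: bool          # reviewer requested changes
    rolled_back: bool            # change was rolled back or hotfixed
    intake_to_pr_hours: float


def summarize(runs: List[RunRecord]) -> Dict[str, float]:
    """Aggregate per-run outcomes into rate metrics; empty input gives {}."""
    total = len(runs)
    if total == 0:
        return {}
    return {
        "plan_approval_rate": sum(r.plan_approved for r in runs) / total,
        "first_pass_rate": sum(r.first_pass_checks_ok for r in runs) / total,
        "review_rework_rate": sum(r.review_rework for r in runs) / total,
        "rollback_rate": sum(r.rolled_back for r in runs) / total,
        "median_intake_to_pr_hours": median(r.intake_to_pr_hours for r in runs),
    }
```

A median is used for time to PR in this sketch because a single long-running ticket would otherwise dominate a mean.
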
Research signals
- Atlassian’s HULA research describes plan generation, code generation, validation, and PR creation with human review.
- IBM Enterprise Advantage frames autonomous development workflows as a path to faster IT delivery.
- McKinsey’s agentic AI security guidance calls for ownership, access control, and traceability across agentic systems.