Coding Agent Adoption · Engineering Organisations
Zero Slack
Busy & Broken

Why engineering teams fail to extract value from AI coding agents — and the four things to fix, in order.

Scoped to coding agent adoption in software engineering orgs
88% use AI somewhere · 6% are high performers
The Bootstrap Trap

AI does create surface-level slack — 97 minutes saved per week from summarisation features alone. But only surface adoption can bootstrap itself from that slack; deep transformation cannot. 88% of organisations use AI somewhere, yet only ~6% achieve high-performer status with meaningful delivery impact. The four levels below explain the gap.

L4
Eval Engineering
The Eval Engineering Gap

Software has unit tests, CI/CD, DORA metrics. The AI development system — your CLAUDE.md, your skills, your compound engineering workflow, your agent instructions — has none of that. You change something. You have no systematic way to know if it helped, hurt, or made no difference.

95% of AI pilots → zero P&L impact · No industry standard exists yet
"Did outcomes improve because the agent got better instructions — or because the task was easier that day? You cannot tell. You are not compounding. You are guessing."
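A minimal sketch of what closing this gap looks like, assuming nothing about your tooling: pin a fixed task suite, score each agent configuration against it, and only then compare. Everything below is a hypothetical stand-in: `run_agent` callables are stubs, and the task checks are toys (a real harness would run tests and linters against agent output).

```python
from typing import Callable

# A task pairs a prompt with a check that verifies the agent's output.
Task = tuple[str, Callable[[str], bool]]

def eval_config(run_agent: Callable[[str], str], tasks: list[Task]) -> float:
    """Pass rate of one agent configuration over a fixed task suite."""
    passed = sum(check(run_agent(prompt)) for prompt, check in tasks)
    return passed / len(tasks)

# Toy suite: real checks would execute tests, not grep strings.
tasks: list[Task] = [
    ("add type hints", lambda out: "->" in out),
    ("write a docstring", lambda out: '"""' in out),
]

# Stubbed agents standing in for two CLAUDE.md variants.
baseline = lambda prompt: "def f(x): return x"
candidate = lambda prompt: 'def f(x: int) -> int:\n    """Identity."""\n    return x'

print(eval_config(baseline, tasks), eval_config(candidate, tasks))
```

The point is the shape, not the stubs: once the suite is fixed, "did the instruction change help?" becomes a measured delta between two pass rates rather than a feeling.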
requires
L3
Paradigm Shift
No Coders — Only Architects Need Apply

The engineer's job has moved upstream. It no longer lives in writing code. It lives in writing specifications, defining evaluation criteria, and directing agents. Engineers still acting as coders are not slower at the new job — they are doing the wrong job entirely.

87% accuracy — clear spec + eval criteria · 19% accuracy — vague multi-file brief
"Senior engineers may start to see the writing on the wall: our jobs are shifting from 'How do I code this?' to 'How do I get the right code built?' — a subtle but profound change." — Addy Osmani, Google
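What "spec plus eval criteria before the task" can look like in practice, as an illustrative sketch. The `AgentTask` structure, its field names, and the example task are invented for this illustration, not any tool's API:

```python
from dataclasses import dataclass, field

@dataclass
class AgentTask:
    """A spec-first agent task: what to build and how to verify it,
    written down before any prompting starts."""
    goal: str
    constraints: list[str] = field(default_factory=list)
    eval_criteria: list[str] = field(default_factory=list)

task = AgentTask(
    goal="Add retry with exponential backoff to the HTTP client",
    constraints=["no new dependencies", "keep the public interface stable"],
    eval_criteria=[
        "existing tests still pass",
        "new test covers three retries followed by failure",
    ],
)
print(task.goal)
```

The architect's work is the contents of those three fields; the agent's work is everything downstream of them.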
requires
L2
Process
Invisible Process

Agents can only act on what is explicitly documented. The tribal knowledge in people's heads, the architecture decisions made in Slack threads, the conventions assumed but never written down — all of it is invisible to the agent. When it can't find context, it hallucinates. And a codebase optimised for human navigation is hostile to agent operation.

70–80% of org knowledge is tacit PRs merged +98% · review time +91% · delivery unchanged
"Optimising for AI agents is really just about removing ambiguity and making implicit knowledge explicit. In other words: it's just good engineering." — Aaron Gustafson, Microsoft
requires
L1
Root Constraint
Zero Slack

Engineering teams run at ~98% utilisation. Queue theory is precise: as utilisation approaches 100%, wait times approach infinity. Improvement tasks queue indefinitely. The team never reaches the productive state — or arrives there superficially and makes things worse. This is not a culture problem. It is a mathematical property of loaded systems.

~98% team utilisation (Reinertsen) · 19% slower with AI, believe 20% faster (METR RCT) · DORA: AI adoption → delivery stability −7.2%
"People are too busy with current delivery commitments to invest time in AI adoption. There is no protected company-wide time allocation for this." — Antti, F-Secure, 2026
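The queue-theory claim is checkable arithmetic. For the textbook M/M/1 queue, average waiting time grows as ρ/(1−ρ) service times, where ρ is utilisation, so the jump from 80% to 98% utilisation is anything but linear:

```python
def avg_wait(utilisation: float, service_time: float = 1.0) -> float:
    """M/M/1 queue: average time a task waits before service begins.
    Wq = service_time * rho / (1 - rho), diverging as rho -> 1."""
    if not 0.0 <= utilisation < 1.0:
        raise ValueError("utilisation must be in [0, 1)")
    return service_time * utilisation / (1.0 - utilisation)

for rho in (0.50, 0.80, 0.90, 0.98):
    print(f"utilisation {rho:.0%}: avg wait = {avg_wait(rho):.1f}x service time")
```

At 50% utilisation a queued task waits about one service time; at 98% it waits about forty-nine. A toy model, but it is why improvement work "queued for later" at 98% utilisation effectively never runs.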
Central Claim

The bottleneck is never the model.
The bottleneck is the harness —
the structured environment around it.

You can buy the same Claude, the same Copilot, the same Codex. That is not a lead. What you cannot buy is the operating environment your team has built: protected capacity to learn, documented processes agents can act on, engineers who design systems rather than write lines, and eval infrastructure that verifies whether any of it is actually getting better. That compounds. That lead is hard to close.

Level | The question to ask | The honest answer in most orgs
L1 — Zero Slack | Do we have protected time specifically for AI coding adoption? | No. It competes with sprint commitments.
L2 — Invisible Process | Could an agent, given our repo, do meaningful work without asking anyone? | No. It would hallucinate half the context.
L3 — No Coders | Are engineers writing specs and eval criteria before starting agent tasks? | No. They're still prompting to get code.
L4 — Eval Engineering | When we change our agent configuration, do we know if it improved? | No. We assume. We don't test.