Tuesday, March 24, 2026

The Mythical Agent-Month – O’Reilly



The following article originally appeared on Wes McKinney’s blog and is being republished here with the author’s permission.

Like a lot of people, I’ve found that AI is terrible for my sleep schedule. In the past I’d wake up briefly at 4:00 or 4:30 in the morning to have a sip of water or use the bathroom; now I have trouble going back to sleep. I could be doing things. Before, I would get a solid 7–8 hours a night; now I’m lucky when I get 6. I’ve largely stopped fighting it: Now when I’m rolling around restlessly in bed at 5:07am with ideas to feed my AI coding agents, I just get up and start my day.

Among my inner circle of engineering and data science friends, there is a lot of discussion about how long our competitive edge as humans will last. Will having good ideas (and lots of them) still matter as the agents begin having better ideas themselves? The human-expert-in-the-loop feels essential now to get good results from the agents, but how long will that last until our wildest ideas can be turned into working, tasteful software while we sleep? Will it be a gentle obsolescence where we happily hand off the reins, or something else?

For now, I feel needed. I don’t describe the way I work now as “vibe coding,” as this sounds like a pejorative “prompt and chill” way of building AI slop software projects. I’ve been building tools like roborev to bring rigor and continuous supervision to my parallel agent sessions, and to heavily scrutinize the work that my agents are doing. With this radical new way of working it’s hard not to be contemplative about the future of software engineering.

Probably the book I’ve referenced the most in my career is The Mythical Man-Month by Fred Brooks, whose now-famous Brooks’s law argues that “adding manpower to a late software project makes it later.” Lately I find myself asking whether the lessons from this book are applicable in this new era of agentic development. Will a talented developer orchestrating a swarm of AI agents be able to build complex software faster and better, and will the short-term productivity gains lead to long-term project success? Or will we run into the same bottlenecks (scope creep, architectural drift, and coordination overhead) that have plagued software teams for decades?

Revisiting The Mythical Man-Month (TMMM)

One of Brooks’s central arguments is that small teams of elite people outperform large teams of average ones, with one “chief surgeon” supported by specialists. This leads to a high degree of conceptual integrity about the system design, as if “one mind designed it, even if many people built it.”

Agentic engineering appears to amplify these problems, since the quality of the software being built is now only as good as the humans in the loop curating and refining specs, saying yes or no to features, and taming unnecessary code and architectural complexity. One of the metaphors in TMMM is the “tar pit”: “Everyone can see the beasts struggling in it, and it looks like any one of them could easily free itself, but the tar holds them all together.” Now, we have a new “agentic tar pit” where our parallel Claude Code sessions and git worktrees are engaged in combat with the code bloat and incidental complexity generated by their virtual colleagues. You can systematically refactor, but invariably an agentic codebase will end up larger and more overwrought than anything built by human hand. This is technical debt on an unprecedented scale, accrued at machine speed.

In TMMM, Brooks observed that a working program is maybe 1/9th of the way to a programming product, one that has the necessary testing, documentation, and hardening against edge cases and is maintainable by someone other than its author. Agents are now making the “working program” (or “appears-to-work” program, more accurately) a great deal more accessible, though many newly minted AI vibe coders clearly underestimate the work involved in going from prototype to production.

These problems compound when considering the closely related Conway’s law, which asserts that the architecture of software systems tends to resemble the organization’s team or communication structure. What does that look like when applied to a virtual “team” of agents with no persistent memory and no shared understanding of the system they are building?

Another “big idea” from TMMM that has stuck with people is the n(n-1)/2 coordination problem as teams scale. With agentic engineering, there are fewer humans involved, so the coordination problem doesn’t disappear but rather changes shape. Different agent sessions may produce contradictory plans that humans have to reconcile. I’ll leave this agent orchestration question for another post.
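To make the n(n-1)/2 figure concrete, here is a minimal sketch that counts the pairwise communication channels in a group of n participants. The function name `communication_channels` is a hypothetical helper chosen for illustration; it says nothing about how agent sessions actually coordinate, only how quickly the number of possible pairings grows.

```python
def communication_channels(n: int) -> int:
    """Return the number of distinct pairs among n participants: n(n-1)/2."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # Each of the n participants can pair with n-1 others; halve to avoid
    # counting each pair twice. The product n*(n-1) is always even.
    return n * (n - 1) // 2


if __name__ == "__main__":
    for team_size in (2, 5, 10, 50):
        print(f"{team_size:>3} participants -> {communication_channels(team_size)} channels")
```

Going from 5 to 50 participants multiplies the head count by 10 but the number of channels by more than 100 (10 versus 1,225), which is the quadratic growth behind Brooks’s warning.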

No silver bullet

“There is no single development, in either technology or management technique, which by itself promises even one order-of-magnitude improvement within a decade in productivity, in reliability, in simplicity.”
—“No Silver Bullet” (1986)

Brooks wrote a follow-up essay to TMMM to look at software design through the lens of essential complexity and accidental complexity. Essential complexity is fundamental to achieving your goal: If you made the system any simpler, it would fall short of its problem statement. Accidental complexity is everything else imposed by our tools and processes: programming languages, tools, and the layer of design and documentation needed to make the system understandable by engineers.

Coding agents are probably the most powerful tool ever created to tackle accidental complexity. To think: I basically don’t write code anymore, and now write tons of code in a language (Go) I’ve never written by hand. There’s a lot of discussion about whether IDEs are still going to be relevant in a year or two, when maybe all we need is a text editor to review diffs. The productivity gains are enormous, and I say this as someone burning north of 10 billion tokens a month across Claude, Codex, and Gemini.

But Brooks’s “No Silver Bullet” argument predicts exactly the problem I’m experiencing in my agentic engineering: The accidental complexity is no problem at all anymore, but what’s left is the essential complexity, which was always the hard part. Agents can’t reliably tell the difference. LLMs are extraordinary pattern matchers trained on the entirety of humanity’s open source software, so while they are brilliant at dealing with accidental complexity (refactor this code, write these tests, clean up this mess), they struggle with the more subtle essential design problems, which often have no precedent to pattern match against. They also tend to introduce unnecessary complexity, generating large amounts of defensive boilerplate that is rarely needed in real-world use.

Put another way, agents are so good at attacking accidental complexity that they generate new accidental complexity that can get in the way of the essential structure that you are trying to build. With a couple of my new projects, roborev and msgvault, I’m already dealing with this problem as I begin to reach the 100 KLOC mark and watch the agents begin to chase their own tails and contextually choke on the bloated codebases they have generated. At some point beyond that (the next 100 KLOC, or 200 KLOC) things start to fall apart: Every new change has to hack through the code jungle created by prior agents. Call it a “brownfield barrier.” At Posit we have seen agents struggle much more in 1 million-plus-line codebases such as Positron, a VS Code fork. This seems to support Brooks’s complexity scaling argument.

I would hesitate to place a bet on whether the present is a ceiling or a plateau. The models are clearly getting better fast, and the problems I’m describing here may look charmingly quaint in two years. But Brooks’s essential/accidental distinction gives me some confidence that this isn’t just about the current limitations of the technology. Figuring out what to build was the hard part long before we had LLMs, and I don’t see how a flawless coding agent changes that.

Agentic scope creep

When generating code is free, knowing when to say “no” is your last defense.

With the cost of generating code now converging to zero, there is practically nothing stopping agents and their human taskmasters from pursuing all avenues that would previously have been cost or time prohibitive. The temptation to spend your day prompting “and now can you just…?” is overwhelming. But any newly generated feature or subsystem, while cheap to create, is not costless to maintain, test, debug, and reason about in the future. What seems free now carries a contextual burden for future agent sessions, and each new bell or whistle becomes a new vector of brittleness or bugs that can harm users.

From this perspective, building great software projects maybe never was about how fast you can type the code. We can “type” 10x, maybe 100x faster with agents than we could before. But we still have to make good design decisions, say no to most product ideas, maintain conceptual integrity, and know when something is “done.” Agents are accelerating the “easy part” while paradoxically making the “hard part” potentially even more difficult.

Agentic scope creep also seems to be actively destroying the open source software world. Now that the bar is lower than ever for contributors to jump in and offer help, projects are drowning in torrents of 3,000-line “helpful” PRs that add new features. As developers become increasingly hands-off and disengaged from the design and planning process, the agents’ runaway scope creep can get out of control quickly. When the person submitting a pull request didn’t write or fully read the code in it, there’s likely no one involved who’s truly accountable for the design decisions.

I have seen in my own work on roborev and msgvault that agents will propose overwrought solutions to problems when a simple solution would do just fine. It takes judgment to know when to intervene and how to keep the agent in check.

Design and taste as our last foothold

Brooks’s argument is that design talent and good taste are the scarcest resources, and now with agents doing all of the coding labor, I argue that these skills matter more than ever. The bottleneck was never hands on keyboards. Now with the new “Mythical Agent-Month,” we can reasonably conclude that design, product scoping, and taste remain the practical constraints on delivering high-quality software. The developers who thrive in this new agentic era won’t be the ones who run the most parallel sessions or burn the most tokens. They’ll be the ones who are able to hold their projects’ conceptual models in their minds, who are shrewd about what to build and what to leave out, and who exercise taste over the enormous volume of output.

The Mythical Man-Month was published in 1975, more than 50 years ago. In that time, a lot has happened: tremendous progress in hardware performance, programming languages, development environments, cloud computing, and now large language models. The tools have changed, but the constraints are still the same.

Maybe I’m trying to justify my own continued relevance, but the reality is more complex than that. Not all software is created equal: CRUD business productivity apps aren’t the same as databases and other critical systems software. I think the median software consulting shop is completely toast. But my thesis is more about development work in the 1% tail of the distribution: problems inaccessible to most engineers. This will continue to require expert humans in the loop, even if they aren’t doing much or any manual coding. As one recent adjacent example, my friend Alex Lupsasca at OpenAI and his world-class physicist collaborators were able to create a formulation of a difficult physics problem and arrive at a solution with AI’s help. Without such experts in the loop, it’s much more dubious whether LLMs would be able to both pose the questions and come up with the solutions.

For now, I’ll probably still be getting out of bed at 5am to feed and tame my agents for the foreseeable future. The coding is easier now, and honestly more fun, and I can spend my time thinking about what to build rather than wrestling with the tools and systems around the engineering process.

Thanks to Martin Blais, Josh Bloom, Phillip Cloud, Jacques Nadeau, and Dan Shapiro for giving feedback on drafts of this post.
