12.3 C
New York
Monday, April 13, 2026

Brokers don’t know what attractiveness like. And that’s precisely the issue. – O’Reilly



Luca Mezzalira, writer of Constructing Micro-Frontends, initially shared the next article on LinkedIn. It’s being republished right here together with his permission.

Each few years, one thing arrives that guarantees to vary how we construct software program. And each few years, the business splits predictably: One half declares the previous guidelines useless; the opposite half folds its arms and waits for the hype to cross. Each camps are often improper, and each camps are often loud. What’s rarer, and extra helpful, is somebody standing in the course of that noise and asking the structural questions: Not “What can this do?” however “What does it imply for the way we design methods?”

That’s what Neal Ford and Sam Newman did in their latest fireplace chat on agentic AI and software program structure throughout O’Reilly’s Software program Structure Superstream. It’s a dialog price pulling aside rigorously, as a result of a few of what they floor is extra uncomfortable than it first seems.

The Dreyfus entice

Neal opens with the Dreyfus Mannequin of Information Acquisition, initially developed for the nursing career however relevant to any area. The mannequin maps studying throughout 5 levels:

  • Novice
  • Superior newbie
  • Competent
  • Proficient
  • Skilled

His declare is that present agentic AI is caught someplace between novice and superior newbie: It might probably observe recipes, it may well even apply recipes from adjoining domains when it will get caught, but it surely doesn’t perceive why any of these recipes work. This isn’t a minor limitation. It’s structural.

The canonical instance Neal provides is gorgeous in its simplicity: An agent tasked with making all exams cross encounters a failing unit take a look at. One completely legitimate technique to make a failing take a look at cross is to switch its assertion with assert True. That’s not a hack within the agent’s thoughts. It’s an answer. There’s no moral framework, no skilled judgment, no intuition that claims this isn’t what we meant. Sam extends this instantly with one thing he’d actually seen shared on LinkedIn that week: an agent that had modified the construct file to silently ignore failed steps quite than repair them. The construct handed. The issue remained. Congratulations all-round.

What’s attention-grabbing right here is that neither Ford nor Newman are being dismissive of AI functionality. The purpose is extra delicate: The creativity that makes these brokers genuinely helpful, their potential to go looking resolution area in methods people wouldn’t suppose to, is inseparable from the identical property that makes them harmful. You may’t absolutely lobotomize the improvization with out destroying the worth. It is a design constraint, not a bug to be patched.

And whenever you zoom out, that is a part of a broader sign. When skilled practitioners who’ve spent many years on this business independently converge on requires restraint and rigor quite than acceleration, that convergence is price being attentive to. It’s not pessimism. It’s sample recognition from individuals who’ve lived by way of sufficient cycles to know what the warning indicators seem like.

Habits versus capabilities

One of the necessary issues Neal says, and I believe it will get misplaced within the general density of the dialog, is the excellence between behavioral verification and functionality verification.

Behavioral verification is what most groups default to: unit exams, practical exams, integration exams. Does the code do what it’s speculated to do in accordance with the spec? That is the pure match for agentic tooling, as a result of brokers are literally getting fairly good at implementing habits towards specs. Give an agent a well-defined interface contract and a transparent set of acceptance standards, and it’ll produce one thing that broadly satisfies them. That is actual progress.

Functionality verification is tougher. A lot tougher. Does the system exhibit the operational qualities it must exhibit at scale? Is it correctly decoupled? Is the safety mannequin sound? What occurs at 20,000 requests per second? Does it fail gracefully or catastrophically? These are issues that the majority human builders wrestle with too, and brokers have been skilled on human-generated code, which suggests they’ve inherited our failure modes in addition to our successes.

This brings me to one thing Birgitta Boeckeler raised at QCon London that I haven’t been capable of cease fascinated with. The instance everybody cites when making the case for AI’s coding functionality is that Anthropic constructed a C compiler from scratch utilizing brokers. Spectacular. However right here’s the factor: C compiler documentation is very well-specified and battle-tested over many years, and the take a look at protection for compiler habits is a number of the most rigorous in your complete software program business. That’s as near a solved, well-bounded downside as you may get.

Enterprise software program is nearly by no means like that. Enterprise software program is ambiguous necessities, undocumented assumptions, tacit data residing within the heads of people that left three years in the past, and take a look at protection that exists extra as aspiration than actuality. The hole between “can construct a C compiler” and “can reliably modernize a legacy ERP” just isn’t a spot of uncooked functionality. It’s a spot of specification high quality and area legibility. That distinction issues enormously for the way we take into consideration the place agentic tooling can safely function.

The present orthodoxy in agentic growth is to throw extra context on the downside: elaborate context recordsdata, structure choice data, tips, guidelines about what to not do. Ford and Newman are appropriately skeptical. Sam makes the purpose that there’s now empirical proof suggesting that as context file dimension will increase, you see degradation in output high quality, not enchancment. You’re not guiding the agent towards higher judgment. You’re simply accumulating scar tissue from earlier disasters. This isn’t distinctive to agentic workflows both. Anybody who has labored critically with code assistants is aware of that summarization high quality degrades as context grows, and that this degradation is simply partially controllable. That has a direct affect on selections revamped time; now shut your eyes for a second and picture doing it throughout an enterprise software program, with many groups throughout totally different time zones. Don’t get me improper, the instruments assist, however the assistance is bounded, and that boundary is usually nearer than we’d prefer to admit.

The extra trustworthy framing, which Neal alludes to, is that we want deterministic guardrails round nondeterministic brokers. No more prompting. Architectural health features, an thought Ford and Rebecca Parsons have been selling since 2017, really feel like they’re lastly about to have their second, exactly as a result of the price of not having them is now instantly seen.

What ought to an agent personal then?

That is the place the dialog will get most attention-grabbing, and the place I believe the sector is most confused.

There’s a seductive logic to the microservice because the unit of agentic regeneration. It sounds small. The phrase micro is within the identify. You may think about handing an agent a service with an outlined API contract and saying: implement this, take a look at it, accomplished. The scope feels manageable.

Ford and Newman give this concept honest credit score, however they’re additionally trustworthy in regards to the hole. The microservice stage is engaging architecturally as a result of it comes with an implied boundary: a course of boundary, a deployment boundary, usually a knowledge boundary. You may put health features round it. You may say this service should deal with X load, keep Y error charge, expose Z interface. In principle.

In follow, we barely implement these items ourselves. The brokers have discovered from a corpus of human-written microservices, which suggests they’ve discovered from the overwhelming majority of microservices that have been written with out correct decoupling, with out actual resilience pondering, with none rigorous capability planning. They don’t have our aspirations. They’ve our habits.

The deeper downside, which Neal raises and which I believe deserves extra consideration than it will get, is transactional coupling. You may design 5 superbly bounded providers and nonetheless produce an architectural catastrophe if the workflow that ties them collectively isn’t thought by way of. Sagas, occasion choreography, compensation logic: That is the stuff that breaks actual methods, and it’s additionally the stuff that’s hardest to specify, hardest to check, and hardest for an agent to cause about. We made precisely this error within the SOA period. We designed beautiful little providers after which found that the attention-grabbing complexity had merely migrated into the mixing layer, which no one owned and no one examined.

Sam’s line right here is price quoting immediately, roughly: “To err is human, but it surely takes a pc to essentially screw issues up.” I believe we’re going to supply some genuinely legendary transaction administration disasters earlier than the sector develops the muscle reminiscence to keep away from them.

The sociotechnical hole no one is speaking about

There’s a dimension to this dialog that Ford and Newman gesture towards however that I believe deserves way more direct examination: the query of what occurs to the people on the opposite facet of this generated software program.

It’s not fully correct to say that every one agentic work is occurring on greenfield initiatives. There are instruments already in manufacturing serving to groups migrate legacy ERPs, modernize previous codebases, and sort out the modernization problem that has defeated standard approaches for years. That’s actual, and it issues.

However the problem in these instances isn’t merely the code. It’s whether or not the sociotechnical system, the groups, the processes, the engineering tradition, the organizational buildings constructed across the current software program are able to inherit what will get constructed. And right here’s the factor: Even when brokers mixed with deterministic guardrails may produce a well-structured microservice structure or a clear modular monolith in a fraction of the time it could take a human group, that architectural output doesn’t mechanically include organizational readiness. The system can arrive earlier than the individuals are ready to personal it.

One of many underappreciated features of iterative migration, the incremental strangler fig strategy, the gradual decomposition of a monolith over 18 months, just isn’t primarily danger discount, although it does that too. It’s studying. It’s the method by which a group internalizes a brand new approach of working, makes errors in a bounded context, recovers, and builds the judgment that lets them function confidently within the new world. Compress that journey too aggressively and you may find yourself with structure whose operational complexity exceeds the organizational capability to handle it. That hole tends to be costly.

At QCon London, I requested Patrick Debois, after a chat overlaying finest practices for AI-assisted growth, whether or not making use of all of these practices constantly would make him comfy engaged on enterprise software program with actual complexity. His reply was: It relies upon. That felt just like the trustworthy reply. The tooling is bettering. Whether or not the people round it are holding tempo is a separate query, and one the business just isn’t spending practically sufficient time on.

Current methods

Ford and Newman shut with a topic that nearly by no means will get lined in these conversations: the huge, unglamorous majority of software program that already exists and that our society depends upon in methods which are straightforward to underestimate.

A lot of the discourse round agentic AI and software program growth is implicitly greenfield. It assumes you’re beginning contemporary, that you just get to design your structure sensibly from the start, that you’ve clear APIs and tidy service boundaries. The fact is that the majority invaluable software program on this planet was written earlier than any of this existed, runs on platforms and languages that aren’t the pure habitat of contemporary AI tooling, and incorporates many years of gathered selections that no one absolutely understands anymore.

Sam is engaged on a guide about this: tips on how to adapt current architectures to allow AI-driven performance in methods which are really protected. He makes the attention-grabbing level that current methods, regardless of their popularity, typically provide you with a head begin. A well-structured relational schema carries implicit that means about information possession and referential integrity that an agent can really cause from. There’s construction there, if you know the way to learn it.

The final lesson, which he states with out a lot drama, is that you would be able to’t simply expose an current system by way of an MCP server and name it accomplished. The interface just isn’t the structure. The dangers round safety, information publicity, and vendor dependency don’t go away since you’ve wrapped one thing in a brand new protocol.

This issues greater than it may appear, as a result of the software program that runs our monetary methods, our healthcare infrastructure, our logistics and provide chains, just isn’t greenfield and by no means can be. If we get the modernization of these methods improper, the results should not summary. They’re social. The intuition to index closely on what these instruments can do in ideally suited circumstances, on well-specified issues with good documentation and thorough take a look at protection, is comprehensible. However it’s precisely the improper intuition when the methods in query are those our lives rely on. The architectural mindset that has served us properly by way of earlier paradigm shifts, the one which begins with trade-offs quite than capabilities, that asks what we’re giving up quite than simply what we’re gaining, just isn’t elective right here. It’s the minimal requirement for doing this responsibly.

What I take away from this

Three issues, principally.

The primary is that introducing deterministic guardrails into nondeterministic methods just isn’t elective. It’s crucial. We’re nonetheless determining precisely the place and the way, however the framing must shift: The aim is management over outcomes, not simply oversight of output. There’s a distinction. Output is what the agent generates. Consequence is whether or not the system it generates really behaves appropriately underneath manufacturing circumstances, stays inside architectural boundaries, and stays operable by the people accountable for it. Health features, functionality exams, boundary definitions: the boring infrastructure that connects generated code to the true constraints of the world it runs in. We’ve had the instruments to construct this for years.

The second is that the individuals saying that is the long run and the individuals saying that is simply one other hype cycle are each in all probability improper in attention-grabbing methods. Ford and Newman are cautious to say they don’t know what attractiveness like but. Neither do I. However we have now higher prior artwork to attract on than the discourse often acknowledges. The ideas that made microservices work, after they labored, actual decoupling, specific contracts, operational possession, apply right here too. The ideas that made microservices fail, leaky abstractions, distributed transactions dealt with badly, complexity migrating into integration layers, will trigger precisely the identical failures, simply quicker and at bigger scale.

The third is one thing I took away from QCon London this 12 months, and I believe it could be crucial of the three. Throughout two days of talks, together with periods that took diametrically reverse approaches to integrating AI into the software program growth lifecycle, one factor grew to become clear: We’re all learners. Not within the dismissive sense however in probably the most literal utility of the Dreyfus mannequin. No person, no matter expertise, has found out the appropriate technique to match these instruments inside a sociotechnical system. The recipes are nonetheless being written. The conflict tales that can finally change into the prior artwork are nonetheless taking place to us proper now.

What bought us right here, collectively, was sharing what we noticed, what labored, what failed, and why. That’s how the sector moved from SOA disasters to microservices finest practices. That’s how we constructed a shared vocabulary round health features and evolutionary structure. The identical course of has to occur once more, and it’ll, however provided that individuals with actual expertise are trustworthy in regards to the uncertainty quite than performing confidence they don’t have. The pace, in the end, is each the chance and the hazard. The expertise is shifting quicker than the organizations, the groups, and the skilled instincts that want to soak up it. The most effective response to that isn’t to faux in any other case. It’s to maintain evaluating notes.

If this resonated, the full fireplace chat between Neal Ford and Sam Newman is price watching in its entirety. They cowl extra floor than I’ve had area to react to right here. And if you happen to’d prefer to be taught extra from Neal, Sam, and Luca, take a look at their most up-to-date O’Reilly books: Constructing Resilient Distributed Programs, Structure as Code, and Constructing Micro-frontends, second version.

Related Articles

LEAVE A REPLY

Please enter your comment!
Please enter your name here

Stay Connected

0FansLike
0FollowersFollow
0SubscribersSubscribe
- Advertisement -spot_img

Latest Articles