Everybody’s an Engineer Now – O’Reilly

May 4, 2026

Cat Wu leads product for Claude Code and Cowork at Anthropic, so she’s well-versed in building reliable, interpretable, and steerable AI systems. And since 90% of Anthropic’s code is now written by Claude Code, she’s also deeply familiar with fitting them into everyday work. Last month, Cat joined Addy Osmani at AI Codecon for a fireside chat on the future of agentic coding and, equally important, agentic code review, how Anthropic actually uses the tools they’re building, and what skills matter now for developers.

The feedback loop is itself a product

Boris Cherny originally built Claude Code as a side project to test Anthropic’s APIs. Then he shared the tool in a notebook, and within two months the entire company was using it. That organic growth, Cat said, was part of what convinced the team it was worth releasing externally.

But what really made that internal adoption visible was the response on Anthropic’s internal “dogfooding” Slack channel. The Claude Code channel gets a new message every 5 to 10 minutes around the clock, and this feedback directly and immediately informs the product experience. Cat described it this way:

We hire for people who love polishing the user experience. And so a lot of our engineers actually live in this channel and notice when there’s issues with new features that they’ve worked on and they proactively lay out the fixes.

The team ships new versions of Claude Code to internal users many times a day. The feedback loop is tight enough that it functions as a continuous integration system for product quality, not just code quality.

Cat told Addy how she once accidentally introduced a small interaction bug between prompts and auto-suggestions. But by the time she started working on a fix, she found another team member had already beaten her to it. It turns out he had set up a scheduled task in Claude Code to scan the feedback channel for anything that hadn’t been responded to in 24 hours and open a PR for it. Since Cat hadn’t gotten to it yet (whoops!), her teammate’s Claude saw the unaddressed issue and fixed it for her. And Cat only found out when “[her own] Claude noticed that his Claude had already landed a change.”

The infrastructure for rapid improvement, in other words, is now partly automated. The agents are writing the code, then monitoring the feedback and closing the loop.
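The triage step behind a scheduled scan like the one Cat described can be sketched in a few lines. This is only an illustration, not Anthropic’s setup: the `FeedbackMessage` record, the `find_stale_feedback` helper, and the sample messages are all hypothetical, and a real version would read from the chat platform and hand each stale item to an agent rather than print it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class FeedbackMessage:
    """Hypothetical record for one post in a feedback channel."""
    message_id: str
    text: str
    posted_at: datetime
    reply_count: int


def find_stale_feedback(messages, now, max_age=timedelta(hours=24)):
    """Return messages with no replies that are at least max_age old, oldest first."""
    stale = [
        m for m in messages
        if m.reply_count == 0 and now - m.posted_at >= max_age
    ]
    return sorted(stale, key=lambda m: m.posted_at)


# Made-up sample data; the timestamps are arbitrary.
now = datetime(2026, 5, 4, 12, 0, tzinfo=timezone.utc)
messages = [
    FeedbackMessage("m1", "Prompt and auto-suggestion overlap", now - timedelta(hours=30), 0),
    FeedbackMessage("m2", "Typo in help text", now - timedelta(hours=30), 2),
    FeedbackMessage("m3", "Slow startup", now - timedelta(hours=3), 0),
]

stale_ids = [m.message_id for m in find_stale_feedback(messages, now)]
print(stale_ids)  # only m1 is both unanswered and older than 24 hours
```

The 24-hour cutoff is the only figure taken from the talk; everything else here is a placeholder.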

The bottleneck has shifted to review

There’s no question that AI-assisted coding has created a boom in output. Anthropic engineers are producing roughly 200% more code than they were a year ago, Cat noted. Today the main constraint is reviewing all that code to ensure it’s production-ready.

Cat’s team concluded that you can buy a lot of extra robustness for not that much extra cost.

We opted for the heaviest, most robust version [of code review]. We actually plot how many agents and how comprehensive of a review Claude does and then how many bugs does it recall. And we picked a range of very high recall and decided we should ship this, because if you really want AI code review to be a load-bearing part of your process, you actually probably just want the most comprehensive possible review.
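To make the recall trade-off concrete, here is a minimal sketch of how a curve like the one Cat describes could be computed. Every number below is invented for illustration and none of it is Anthropic’s data; the bug IDs, the agent counts, and the `smallest_config_meeting` helper are assumptions.

```python
def recall(found, known):
    """Fraction of known bugs that the review reported."""
    if not known:
        raise ValueError("need at least one known bug to measure recall")
    return len(set(found) & set(known)) / len(known)


# Made-up benchmark: bugs known to exist in a set of test PRs.
known_bugs = {"B1", "B2", "B3", "B4", "B5"}

# Made-up results, keyed by how many review agents were run.
findings_by_agent_count = {
    1: {"B1", "B2"},
    3: {"B1", "B2", "B4"},
    5: {"B1", "B2", "B3", "B4"},
    10: {"B1", "B2", "B3", "B4", "B5"},
}

# Recall for each configuration, ordered by agent count.
recall_curve = {
    n: recall(found, known_bugs)
    for n, found in sorted(findings_by_agent_count.items())
}


def smallest_config_meeting(curve, target):
    """Return the smallest agent count whose recall reaches target, or None."""
    for n in sorted(curve):
        if curve[n] >= target:
            return n
    return None


for n, r in recall_curve.items():
    print(f"{n:>2} agents: recall {r:.0%}")
```

Plotting `recall_curve` against cost per review is what would let a team choose a point on the curve. The team’s choice, per Cat, was the high-recall end.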

The review agent doesn’t just look at the diff. It traces code across multiple files and catches bugs in adjacent code that has nothing to do with the change in question. Cat gave two examples. One was a ZFS encryption refactor where the agent found a key cache invalidation bug that wasn’t related to the author’s change at all but would have invalidated it. The other was a routine auth update that turned out to have a bad side effect, caught premerge. In both cases, engineers manually reviewing the code likely would have missed the bugs.

The human review that remains is deliberately small in scope. For most PRs, the human reviewer skims for design principle violations and obvious problems and assumes functional correctness has been handled. Five to 10 agents run in parallel, each given slightly different tasks, returning independently and then deduplicating what they found.
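The fan-out-and-deduplicate pattern in that last sentence can be sketched as follows. This is an assumption-laden illustration, not the actual review system: `run_review_agent` is a hypothetical stand-in that returns canned findings, the focus areas are invented, and deduplication here is exact matching, where a real system would likely need fuzzier comparison.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One reported issue; frozen so it is hashable and can live in a set."""
    file: str
    line: int
    summary: str


def run_review_agent(focus):
    """Hypothetical stand-in for a review agent; returns canned findings."""
    canned = {
        "security": [Finding("auth.py", 42, "token not validated")],
        "concurrency": [Finding("cache.py", 10, "stale key after rotation")],
        "adjacent-code": [
            Finding("cache.py", 10, "stale key after rotation"),
            Finding("auth.py", 42, "token not validated"),
        ],
    }
    return canned.get(focus, [])


def review_in_parallel(focus_areas):
    """Run one agent per focus area, then merge and deduplicate the findings."""
    unique = set()
    with ThreadPoolExecutor(max_workers=max(1, len(focus_areas))) as pool:
        futures = [pool.submit(run_review_agent, f) for f in focus_areas]
        # Each agent returns independently; collect results as they finish.
        for future in as_completed(futures):
            unique.update(future.result())
    return sorted(unique, key=lambda f: (f.file, f.line, f.summary))


findings = review_in_parallel(["security", "concurrency", "adjacent-code"])
for f in findings:
    print(f"{f.file}:{f.line} {f.summary}")
```

Three agents report four findings between them, and the merged list holds two, because overlapping reports collapse to one entry each.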

The cultural shift that made this work, though, was ownership. The team moved to a model where the engineer who authors a PR owns it end to end, including postdeploy bugs, and doesn’t lean on peer reviewers to catch mistakes. “Otherwise,” as Cat pointed out, “you have situations where junior engineers put out a bunch of PRs and then your senior engineers are like drowning in AI-generated stuff where they’re not sure how thoroughly it’s been tested.”

Full ownership meant the AI review had to actually be trustworthy, which drove the decision to go for high recall rather than a lighter touch. That said, engineers are still expected to understand every line of code an agent creates. . .for now. As Cat explained, it’s the only way to truly prevent “unknown security vulnerabilities and to be able to quickly respond to incidents if they are to happen.”

Everybody’s kind of an engineer now

Cowork, Anthropic’s agent tool for nontechnical users, is the company’s attempt to take what Claude Code does for engineers and bring it to knowledge work more broadly. Cat sketched a picture of someone looking at five or six agent tasks running simultaneously in a side panel, managing a fleet of agents the way a senior engineer manages a PR queue.

In the nearer term, she’s keeping tabs on the shift toward people using Claude Code to build things for themselves, their teams, or their families that wouldn’t have justified professional development effort or “otherwise been possible.” The prototype is the garage project, the family expense tracker, the tool that a small team actually needs but that no SaaS product quite addresses. Cat’s goal and hope is that Claude Code helps people “solve their own problems for themselves” and “stewards a new future of personal software.”

Product taste as the new technical skill

More people building more software is unambiguously good. Boris Cherny has even floated the idea that coding as we know it is “solved.” But what does that mean for the craft of software engineering? Cat’s read of the current moment is more nuanced:

I think pre-AI, the skills that were essential were being able to take a spec and implement it well. And I think now the really important skill is product taste. Even for engineers. Can you use code to ingest a massive amount of user feedback? Do you have good intuition about which feature to build to address those needs, because it’s often different than exactly what users are asking you for? And then, when Claude builds it, are you setting up the right bar so that what you ship people actually love?

Cat’s not alone in highlighting the importance of taste in a world where code is a commodity. Steve Yegge, Wes McKinney, and many others, myself included, see taste and judgment as a uniquely human value. This has practical implications for how engineers should spend their time now, and for what the next generation needs to learn.

For junior engineers specifically, Cat described a progression: Start by using Claude Code to understand the codebase (ask all the “dumb questions” without embarrassment), take those answers to a senior engineer for calibration, and then close the loop by updating the CLAUDE.md with whatever was missing.

Think of Claude Code as your intern that you’re trying to level up. Like, teach it back to Claude. Add a /confirm slash command. Put it in the CLAUDE.md or the agent README. Approach this as senior engineers helping you level up, and then you helping Claude and other agents level up.

The progression, in other words, should be bidirectional. Engineers get better at using the tools and the tools get better through the engineers’ accumulated knowledge. And significantly, this process keeps humans firmly in the loop, playing a role that’s “active, continuous, and skilled.”

You can watch Cat and Addy’s full chat, plus everything else from AI Codecon, on the O’Reilly learning platform. Not a member? Sign up for a free 10-day trial, no strings attached.
