I’ve long advocated that software engineers should read The Mythical Man-Month[0], but I believe it’s more important now than ever.
The last ~25 years or so have seen a drastic shift in how we build software, best trivialized by the shift from waterfall to agile.
With LLM-aided dev (Codex and Claude Code), I find myself going back to patterns that are closer to how we built software in the 70s/80s than anything in my professional career (last ~15 years).
Some people are calling it “spec-driven development” but I find that title misleading.
Thinking about it as surgery is also misleading, though Fred Brooks’ analogy is still good.
For me, it feels like I’m finally able to spend time architecting the bridge/skyscraper/cathedral without getting bogged down in which bolts we’re using, where the steel comes from, or which door hinges to use.
Those details matter, yes, but they’re the type of detail that I can delegate now; something that was far too expensive (and/or brittle) before.
[0] https://en.wikipedia.org/wiki/The_Mythical_Man-Month
There's not a lot that Brooks got wrong but the surgical team is it.
There's not a lot of team in a surgical team. Software does not need to be limited to one thing happening at a time the way open heart surgery does. There's room for more hands.
It's more like a sports team. But we don't practice, review, or coach the way they would, and so it takes a lot longer for us to reach excellence.
> There's not a lot of team in a surgical team
Surgeon, assistant, tech, anesthesiologist, plus probably a nurse...?
Have you ever seen or been to a serious surgery? The operating room is full of people, and those are some of the most gelled teams you'll find anywhere.
Brooks' idea is that the surgeon calls all the shots and everyone else is being orchestrated by that one person. There is only one person holding a knife at any point and that's generally the surgeon.
How many people are there besides the surgeon? About six? That's pretty much all one person can wrangle.
That's not enough for a large scale software project. We keep trying to make it work with Scrum teams but it is broken and we know it.
Most of the highest functioning projects I've been on have had a couple of 'surgeons'. And while they often work on separate 'patients', it's not always the case.
Aren't there some surgeries now where more than one surgeon is operating concurrently? All I can find is that there's an Insurance Code for it.
Can you go into a bit more detail on your perspective of the 70s/80s approach vs. today? I’m an analyst with a passion for engineering, but am not an engineer by trade. So honestly I am naive to how now would be different from the past.
My take is that in the 70s/80s, programs were built from a set of blueprints of what was required. Each programmer had a set of requirements, knew what was required, when it needed to be completed, and which tools were available to enable the next level of development. If someone lagged behind then the project halted, but the end result was solid: a completed application. At least during that time other programmers could improve their work and documentation.
Meanwhile, agile has always felt like a race to me. If you don't complete your part, you spend a whole week focusing on it while the others carry on, anticipating that the sprint will end with the required part completed, enabling the next part they've built to be integrated.
Vibe coding offers that "make a text box write to a file" kind of code generation, and it delivers. However, without any blueprints the code starts to crumble when you introduce middleware such as authentication.
How authentication should authenticate was never discussed, because someone is already that far ahead in their own work, so you end up mashing more code together until it does work. It does, and your product holds its premise.
The "we'll improve, document and fix it later" never comes, because the influx of feature requests leads to bloat. Now bogged down in tech debt, you spend resources on senior engineers wrangling the mess.
The senior engineers are now fixing the mess, and their experienced code doesn't integrate with that of the juniors. Having to tidy that code as well, the seniors leave behind the real work they were originally tasked with improving, turning the whole codebase into a diabolical mess. But hey, it works.
Hardware is cheaper than refactoring, so instead you "scale" by throwing more hardware at it until it handles the pressure.
Someone then leaves and their knowledge is lost. Some exec promotes the person on the team who was crazy-sane enough to keep it all in check to the senior role, while skimping on their pay; now they're handling the lost work, their own, and keeping the team in check.
The product starts to fail, new junior engineers are brought in with fresh naive wisdom, jumping up and down about the newest fancy library that will finally make the process complete, and the cycle repeats itself indefinitely.
> My take is that in the 70s/80s, programs were built from a set of blueprints of what was required. Each programmer had a set of requirements, knew what was required, when it needed to be completed, and which tools were available to enable the next level of development.
The thing is, you couldn't start dev until you had those blueprints. So that's a lot of time at the start of the project where development had to sit idle, even if you already had a good idea of at least what the architecture would be.
> If someone lagged behind then the project halted, but the end result was solid: a completed application. At least during that time other programmers could improve their work and documentation.
No, you didn't get this. Apps that were completed had bugs then too, whether in design, specification, or implementation. It's why the waterfall paper basically said you'd have to build it twice either way, so you should try to accelerate building it the first time so you'd know what you messed up when you built it the second time.
Or as Fred Brooks, who wrote The Mythical Man-Month, would say: "Build one to throw away; you will, anyhow."
Nor could programmers productively spend downtime simply documenting things; the documentation was supposed to have already been written by the time they were writing out punch cards. The "programming" had already been done, in principle; what remained was transcribing the processes and algorithms into COBOL or FORTRAN.
Startups are perfectly free to adopt the methods of the 70s if they wish, but they will be outcompeted and ground into dust in the process. Likewise, there is more to agile than Scrum (which is what you're describing with sprints), and it seems weird to describe the dread you'd get from blocking your team if it takes a week to do your part, but act as if a week's slip on the critical path in a waterfall effort is no big deal.
I mean, you're actually right that many (not all) waterfall-based teams treat it like it's no big deal, but that's the reason that waterfall projects were often disastrously over-time and over-budget. "We've already slipped 3 weeks to the right, what's another day?". Well, those add up... at least with agile you can more easily change the scope to fit the calendar, or adapt to changing market pressures, or more rapidly integrate learnings from user research.
More the 70s than the 80s; our company wrote software in the mid 80s more or less the same way we (my company) still do. In the 80s-90s we just shipped a v1.0 a lot later than we do now, simply because in those times updates were basically impossible. Especially software on cartridges meant you had to ship with 'no bugs'. But no blueprints; we just started on an idea we had and worked until it was good enough to ship.
> But no blueprints; we just started on an idea we had and worked until it was good enough to ship.
Yes, exactly. A lot of UNIX and other very good software for computers back then came about this way too. No or minimal blueprints and a lot of iterative implementation & testing and reacting to what you see.
It's hard convincing people today that agile methods have been in use long before sprints were a thing.
We're now getting in a position where we have CAD for software, aka CASE, Computer Aided Software Engineering. You can focus on the design of the software, instead of spending hours typing out code.
The question is: is that an AI thing or just the domain of conventional devtools?
> Those details matter, yes, but they’re the type of detail that I can delegate now
No...
If you're building a skyscraper, there's no world where you can delegate where the steel or bolts come from. Or you'll at least need to care about what properties that exact steel has and guarantee every bit used on your project matches these constraints.
If you don't want to care about those, build residential houses, which have 1000x fewer constraints and can be rebuilt on a dime comparatively.
You might be thinking about interior decoration or floor arrangement? Those were always a different matter, left to the building owner to deal with.
In the world of construction there’s generally an owner, who then works with three groups: an architect, an engineer, and a general contractor.
Depending on what you’re building, you might start with an architect who brings on a preferred engineering firm, or a GC that brings on an architect, etc.
You’re right to question my bridge/bolt combo, as the bolts on a suspension bridge are certainly a key detail!
However, as a programmer, it feels like I used to spend way too much time doing the work of a subcontractor (electrical, plumbing, HVAC, cement, etc.), unless I got lucky with a library that handled it for me (and that I trusted).
Software creation thus always felt like building a new cathedral, where I was both the architect and the stone mason, and everything in between.
Now I can focus on the high-level, and contract out the minutia like a pre-fab bridge, quality American steel, and decorative hinges from Restoration Hardware, as long as they fit the requirements of the project.
I think you're taking a metaphor a bit too literally.
No, it's perfectly apt. One comment is stating that using LLMs allows them to gloss over the details. The responding comment is saying that glossing over details is not a great idea, actually. I think that statement holds up very well on both sides of the analogy. You can get away with glossing over certain details when building a little shed or a throwaway Python script. If you're building a skyscraper or a full-fledged application being used in the real world by thousands or millions of people, those details being glossed over are the foundation of your entire architecture, will influence every other part of the decision-making process, and will cause everything to crumble if handled carelessly.
> If you're building a skyscraper, there's no world where you can delegate where the steel or bolts come from.
Ah yes, I'm sure the CEO of Walsh (https://www.walshgroup.com/ourexperience/building/highrisere...) picks each bolt themselves directly without delegation.
That's actually a good analogy.
A novice surgeon operating under the impression the nursing and anesthetics staff will help them if they make a mistake, will kill a patient very quickly.
Just because you can't necessarily do surgery effectively without these teams doesn't mean you don't need the senior surgeon to train you first (or senior surgeons to begin with).
And a bad anaesthetist or a nurse that obstructs the surgeon's view will kill a patient despite the quality of the surgeon, though the adept surgeon may manage to spot impending doom early and work around it.
The big problem comes when aspiring surgeons without the necessary experience think it's all small potatoes because they don't have to know much about which scalpel to use because the nurse will hand it to them anyway.
So yes, if killing an unnecessary number of patients is an acceptable cost of eventually learning to do surgery this way, then by all means code like a surgeon from day one. Otherwise go to med school first like the rest of us.
The author kindly informs us that he is a "UI prototyper [...] tinkering with design concepts" and also that he works for a company making AI coding software. This double-whammy may somewhat explain the strong Dunning-Kruger gravitational lensing observed when viewing the article from a distance.
Is Notion a company making "AI coding software" these days?
If you want to learn more about Geoffrey I suggest browsing through https://www.geoffreylitt.com/#projects - I've been following his work for a few years, "UI prototyper" is him under-selling himself.
This analogy is fundamentally flawed, both literally and metaphorically:
> A surgeon isn’t a manager, they do the actual work! But their skills and time are highly leveraged with a support team that handles prep, secondary tasks, admin. The surgeon focuses on the important stuff they are uniquely good at.
First, the literal.
Surgeons are managers of the operations they perform and heavily rely on the surgical team with which they work. If the author had any clue about surgeries, they would understand that the most important person in a major surgery is the anaesthesiologist, not the surgeon(s).
Second, the metaphorical.
The author goes to great lengths to identify "grunt work" as being "not the most intellectually fulfilling or creative part of the work." What they fail to understand is that there is no such thing as "grunt work" if, for any definition of work, it is valued without judgement.
But if a person identifies with being "the surgeon", with everyone else being "a support team that handles prep, secondary tasks, admin", then the post makes sense from an egocentric perspective.
I assume you haven't read The Mythical Man-Month[0]?
The author is referencing an existing analogy from Fred Brooks, and building upon it.
Sure, today the anesthesiologist might be the most "important" person in the room, but that's not the idea behind the analogy.
Your emphasis that surgeons "heavily rely on the surgical team" is just as important to Brooks' beliefs, in that the "Chief Programmer" is only able to do what they do via the support of the team.
The "grunt work" (noted by the author) seems solely focused on tasks given to the "Co-pilots" (or assistant programmers), notably with no specific mention of the other supporting roles (admin, editor, secretaries, clerk, toolsmith, tester, and "language lawyer"), many of which have been replaced by SaaS tooling (GitHub, Jira, Notion, Docusaurus, etc.) or filled by other roles (PMs, SDETs, etc.).
Furthermore, the author even states:
> I hate the idea of giving all the grunt work to some lower-status members of the team. Yes, junior members will often have more grunt work, but they should also be given many interesting tasks to help them grow.
The author clearly sees less experienced programmers as mentees, rather than just some grunts whose work is beneath them.
The analogy may not be perfect, but their message about "AI coding tools" should be valued without judgement (and without accusations of egocentric thinking).
[0] https://en.wikipedia.org/wiki/The_Mythical_Man-Month
> the most important person in a major surgery is the anaesthesiologist, not the surgeon(s)
Could you explain more? It seems to me that, as sans surgeon there is no surgery, the surgeon is inevitably the most important person. Anæsthetic in its current form is a comparatively recent invention; historically major surgery was done without it, and in emergencies can still be done without it (at the cost of excruciating pain and far higher risk of negative outcomes, of course).
I learned about this recently, myself, and was surprised and ended up looking into it.
I read that the anesthesiologist is the person responsible for the patient during the surgery.
Apparently their role is:
- Provide continual medical assessment of the patient
- Monitor and control the patient’s vital life functions, including heart rate and rhythm, breathing, blood pressure, body temperature and body fluid balance
- Control the patient’s pain and level of consciousness to make conditions ideal for a safe and successful surgery
The gist I got from the other things I've read is that the anesthesiologist also has the most go/no-go responsibility before and during surgery.
Source: https://www.medschool.umaryland.edu/anesthesiology/patient-i...
Edit: I recommend reading more about the anesthesiologist's role. I found it interesting even as an entirely casual observer.
Long ago I learned that anesthesiologists make more money than devs when starting out. Not a lot more, but most medical professionals do not.
When I asked why, the answer was that it's the most dangerous job there: there's more opportunity to kill a patient with the anesthesia than by any other means.
The main reason they don't want you to eat before surgery is that you can regurgitate and damage your lungs. But even if they solved that, the anesthesiologist's job is easier if you're in a fasting state both before and at the end of a surgery.
A quadruple bypass is not dangerous because you're stitching 4 new arteries onto a heart. It's dangerous because it takes so long to stitch 4 new arteries onto the heart that you're running up against the limits of how long you can safely keep someone sedated without causing life threatening complications.
I'm having trouble finding current statistics but at the time I was learning this, a double bypass was many times safer than a quadruple. Articles on bypass surgery understandably focus on the aspects that are within the patient's control.
> Experienced doctors, yes. Though I’m hearing some ridiculous salaries in SF.
There is no medical specialty except perhaps pediatrics/geriatrics where the pay will start below $200,000. There is a relatively modest effect of seniority on physician salaries (there is a huge amount of quality control/gatekeeping before one becomes an attending). This is nationwide, not in SF.
I'm not a developer but I don't think $300-400,000, normal salaries for fields like inpatient psychiatry or subspecialty medicine, are common for new developers, or even for any developer (vs a manager).
I thought the duration-related risk for that kind of surgery was based on how long the patient is put on a heart and lung machine? Naively I'd expect that to be riskier than the anesthesia.
I do know all that, but it still doesn’t really seem enough to qualify them as the most important person. Sure, they have the biggest power of veto, but without the surgeon, there is no surgery at all.
Remove the anæsthetist, and procedure forbids you from continuing. Remove the surgeon, and it’s impossible to continue. … Or remove the patient. I guess the patient is the most important person there after all! (Actually, that does a pretty good job of showing how the entire notion of “most important” may not make sense.)
this line of thinking is flawed - remove this tiny resistor from the motherboard and the computer will not boot -> this proves this tiny resistor is as important as the CPU.
> But if a person identifies with being "the surgeon", with everyone else being "a support team that handles prep, secondary tasks, admin", then the post makes sense from an egocentric perspective.
They're not talking about other people being the support team. They're calling the AI tooling the support team.
The anesthetist is the person who is primarily responsible for the patient remaining alive. You can decide for yourself whether that's more or less important than what the surgeon's doing.
As someone who has had to have 3 surgeries in the last few years I'm very grateful that both the surgeon and anesthetist in each case did a fantastic job and didn't do anything like the author of TFA is suggesting. FWIW I don't think the analogy in the article survives knowledge of what surgeons actually do.
The analogy should give you a hint of the kind of responsibility and work involved.
It may be flawed, but most people get the point, especially compared to vibe coders who just wait outside the operating room.
I find that a lot of similar analogies are flawed.
On the landing page of one of the frameworks (I don't remember which one, unfortunately) there was a description comparing a programmer to a woodworker.
It was written that this woodworker, as a reliable and skilled craftsman, meticulously makes each piece of furniture with the same care, which isn't really true. For example, quite often the back panels remained unfinished, with traces of aggressive planing.
So the whole premise that "this framework will help you craft software as meticulously as a woodworker crafts furniture" doesn't check out.
It's a nice analogy, and I think I'll use it in future.
If you want another one, think of painting. An "Old Master" painter like Rembrandt or Rubens or Botticelli would have had a large workshop with a team of assistants, who would not only do a lot of the work like stretching canvases or mixing the paints, but would also - under the master's direction - actually do a lot of the painting too. You might have the master sketch out the composition, and then paint the key faces (and, most of all, the eyes) and then the assistants would fill in areas like drapery, landscape, etc.
This changed in the Romantic period towards the end of the 1700s, with the idea of the individual artist, working alone in a moment of creative inspiration and producing a single work of genius from start to finish. Caspar David Friedrich or JMW Turner come to mind here.
Some programmers want to be Turner and control the whole work and feel their creativity is threatened if a machine can now do parts of it as well as they could. I'd rather be Rembrandt and sketch out the outline, paint the eyes, and leave the rest to junior engineers... or an AI Agent. It's a matter of preference.
> I'd rather be Rembrandt and sketch out the outline, paint the eyes, and leave the rest to junior engineers
What you’re not mentioning is that code isn’t an end product. It’s the blueprint for one. The end product is the process running and solving some needs.
What makes software great is how easy it is to refine. The whole point of software engineering is to ensure confidence that the blueprint is good, and that the cost of changes is not enormous. It’s not about coding quickly, throwing it over the wall, and being done.
The process you outline would be like noting down a few riffs, fully composing a few minutes (measures?), and then having a few random people complete the full symphony. It’s not a matter of having a lot of sheet music, it’s a matter of having good music. The sheet music is important because it helps transmit the ideas to the conductor, who then trains the orchestra. But the audience doesn’t care about it.
Same with users: they don’t care about the code, but they do care about bugs and missing features. Acting on that feedback requires good code. If you can get good code with your process, it’s all good. But I’m still waiting for the proof.
I like this approach. After some months of gamely trying to involve Claude in my coding process, I find it's much more enjoyable and efficient for me to just write the code myself, rather than babysitting it and going over and over something it wrote finding errors and logical flaws. I was about to cancel my subscription.
This month, however, I had to start upgrading a very large db from MySQL 8 to 9, on a codebase that involves a few hundred long and complex queries. I wrote this code, so I know what they do and I understand the schema, but I'm also aware that some of them may violate ONLY_FULL_GROUP_BY, and a few (but can't remember which) had historical oddities unless DERIVED_MERGE was turned off. I was girding myself to run tests on dozens of suspects, but as a lark I handed Claude the whole codebase and the schema and asked it to identify any queries that might break if those two default options were enabled. Astonishingly, it did. It identified 15 queries and explained what they violated. It was wrong about 7 of them... those seven were actually fine, and it acknowledged that it had farmed this out in some way and that whatever it used for evaluation had been overly conservative and had returned false positives. But the other 8 queries were fixable. It was also a bit off about how to rewrite them - it created a lot of slow and inefficient workarounds. It neglected the idea of lateral joins, it didn't write SQL optimized for the indexes, etc. But it saved me hours and hours of locating these queries myself.
To me, this idea of letting the LLM render down a lot of code into potential pain points seems like a much more valuable use of it than asking it to write code itself. I kept my subscription for that reason.
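For anyone curious what that kind of "render down to pain points" pass looks like mechanically, here is a minimal, hypothetical sketch in Python. It is not what Claude does internally; it is just a coarse regex filter that flags GROUP BY queries with no aggregate function nearby, so a human (or an LLM) can review them for ONLY_FULL_GROUP_BY problems. The file extensions and window sizes are arbitrary assumptions, and it will over-report, much like the model's first pass did.

    # Hypothetical pre-filter, in the spirit of the audit described above: flag
    # GROUP BY queries that have no aggregate function nearby, so a reviewer can
    # check them against ONLY_FULL_GROUP_BY. Coarse on purpose; it over-reports,
    # and queries *with* aggregates can still violate the mode.
    import re
    from pathlib import Path

    GROUP_BY = re.compile(r"\bGROUP\s+BY\b", re.IGNORECASE)
    AGGREGATES = re.compile(r"\b(COUNT|SUM|MIN|MAX|AVG|ANY_VALUE|GROUP_CONCAT)\s*\(", re.IGNORECASE)

    def suspect_queries(root="."):
        """Yield (file, line, excerpt) for GROUP BY queries worth a manual look."""
        for path in Path(root).rglob("*"):
            if not path.is_file() or path.suffix not in {".sql", ".py", ".php"}:
                continue
            text = path.read_text(errors="ignore")
            for match in GROUP_BY.finditer(text):
                # Look at a window of text around the GROUP BY clause.
                window = text[max(0, match.start() - 400):match.end() + 200]
                if not AGGREGATES.search(window):
                    line = text.count("\n", 0, match.start()) + 1
                    yield path, line, " ".join(window.split())[:120]

    if __name__ == "__main__":
        for path, line, excerpt in suspect_queries():
            print(f"{path}:{line}: {excerpt}")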
This has been saving me a lot of time as well in a decade old code base. I can paste a stack trace and provide additional relevant context, then ask the LLM to do a first pass debug.
From that I usually get a list of file+lines to manually review, along with some initial leads to chase.
Another use case is when fixing performance issues. I can feature flag my fix and ask the model to confirm the new code path will produce the same result for a given set of inputs. We also have test coverage for this kind of thing, but the LLM can do a once-over and point out some flaws before I ever run those tests.
I haven’t gotten to the point where it writes much code for me beyond the auto-complete, which has been a modest boost in efficiency.
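As a rough illustration of that feature-flagged "same result" check (independent of any particular codebase or flag system, with placeholder names throughout), a shadow comparison can run both code paths and log divergences, so the LLM's opinion becomes a second signal rather than the only one:

    # A generic shadow-check sketch: run the old and new code paths behind a flag
    # and log any divergence. Names (compute_report, SHADOW_CHECK_ENABLED) are
    # placeholders, not anything from the commenter's system.
    import logging

    log = logging.getLogger("shadow_check")
    SHADOW_CHECK_ENABLED = True  # in practice, driven by a real feature-flag system

    def compute_report(data, *, old_impl, new_impl):
        """Return the old path's result, comparing it to the new path when flagged."""
        result = old_impl(data)
        if SHADOW_CHECK_ENABLED:
            try:
                candidate = new_impl(data)
                if candidate != result:
                    log.warning("shadow mismatch for %r: old=%r new=%r", data, result, candidate)
            except Exception:
                log.exception("new code path raised for %r", data)
        return result

    if __name__ == "__main__":
        logging.basicConfig(level=logging.WARNING)
        old = lambda xs: sum(xs)
        new = lambda xs: sum(sorted(xs))  # the rewrite under test; same result here
        compute_report([3, 1, 2], old_impl=old, new_impl=new)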
Yeah. As a debugging aid, I think it's fairly solid at surfacing things to look at and fix manually. And when you do that, you're actually hoping for more false positives than false negatives - which plays to the strengths of an LLM. When it comes to asking for rewrite suggestions for anything, I have to really go over its logic with a fine-tooth comb, because there are usually edge cases that can be spotted if you really think through it. I abhor its tendency to use try/catch. I've seen it write weird SQL joins that slow down queries by 30x. I'd never trust it to debug a race condition or to consider any side effects outside the 30 LoC it's currently looking at.
I guess I wouldn't trust it to confirm that new code would give the same result, but it can't hurt to ask, since if it told me the code wouldn't, that would make me look more closely at it.
I think as long as you look at it as part of a distillation process, and aim for false positives, and never actually trust it, it's good at helping to surface issues you may have missed.
I'm kinda surprised this isn't more popular. I figured we'd go this way eventually as we single out 10x-ers, give them a highly competent crew, and save a lot of money over your most expensive code monkey wasting time attending meetings, filling out Jira tickets, and giving presentations to the customer. You pay them a shitload of money - shouldn't you get every dollar's worth?
Honestly, at every job I spend an unreasonable amount of time getting up to speed on things that are only tangentially related to my job (No, here we need you to check all the boxes in the Jira ticket, ensure it's linked to a zephyr ticket, and ensure it's linked to a git PR - we don't care about you adding attachments or comments!)
Yikes. Don’t code the way this author thinks surgeons operate. All those “support” tasks are critically important and you likely couldn’t do them as well. Be humble and appreciate that all the stuff being done around you is also the important stuff. Support them just as much!
I really like Geoffrey Litt's new analogy for working with AI coding tools:
> Personally, I'm trying to code like a surgeon.
> A surgeon isn't a manager, they do the actual work! But their skills and time are highly leveraged with a support team that handles prep, secondary tasks, admin. The surgeon focuses on the important stuff they are uniquely good at.
It's also a neat callback to the Mythical Man Month, the most influential early textbook on large scale software engineering.
The surgical metaphor really resonates with my experience building real-time interview analysis systems. One key parallel I've found is that both surgery and complex software require careful state management and graceful error recovery.
When we first implemented real-time code analysis during interviews, we struggled with the "do no harm" principle - how to provide feedback without disrupting the candidate's flow. Our initial approach of running full AST analysis on every keystroke caused noticeable UI lag (150-200ms). We solved this by moving to a chunked analysis pattern with a 500ms debounce and incremental parsing. This reduced CPU usage by 70% while keeping perceived latency under 50ms.
The trickiest part was handling partial/invalid syntax states without crashing the analyzer. We ended up implementing a recovery mechanism similar to how modern IDEs handle incomplete code - maintaining a stack of valid states and rolling back on error rather than failing completely. Curious if others have found better approaches to real-time code analysis with incomplete input?
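For what it's worth, here is a minimal sketch of the debounce-plus-last-known-good pattern described above, using Python's ast module as a stand-in parser; the real system presumably sits in a browser on top of an incremental parser, so treat this as the shape of the idea rather than the implementation:

    # Sketch of debounce + last-known-good recovery. The delay, callback, and
    # class name are illustrative only.
    import ast
    import threading

    class DebouncedAnalyzer:
        def __init__(self, delay_s=0.5, on_result=print):
            self.delay_s = delay_s            # ~500ms debounce window
            self.on_result = on_result
            self._timer = None
            self._last_good = ast.parse("")   # last successfully parsed tree

        def on_keystroke(self, source):
            """Called on every edit; only the last edit in the window gets analyzed."""
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self.delay_s, self._analyze, args=(source,))
            self._timer.daemon = True
            self._timer.start()

        def _analyze(self, source):
            try:
                tree = ast.parse(source)
                self._last_good = tree        # commit the new valid state
            except SyntaxError:
                tree = self._last_good        # roll back to the last valid state
            self.on_result(f"analyzed {len(list(ast.walk(tree)))} AST nodes")

    if __name__ == "__main__":
        analyzer = DebouncedAnalyzer()
        analyzer.on_keystroke("def f(:")                 # invalid partial edit
        analyzer.on_keystroke("def f(x): return x + 1")  # valid final state
        threading.Event().wait(1.0)                      # let the debounce timer fire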
The role of the engineer in large enterprises with deep legacy systems is indeed quite like a surgeon, but not for the reason suggested. Rather, it involves delicate operations on a complex and confusing object that you didn’t design or build, that maybe nobody really understands in full, where simply locating the problem is more than half the job, where parts depend on other parts in unexpected ways, where one tiny mistake can have catastrophic consequences, and where a litany of credentials and clearances are required before they’ll even let you near the operating table.
> Yes, junior members will often have more grunt work, but they should also be given many interesting tasks to help them grow.
This and all the talk of status is a bit worrying
Junior devs are challenged by and have room to grow from tasks that would be routine and boring for a much more senior dev. The junior member might be completely out of their depth taking on a task that the senior dev would be challenged and grow from. This may influence who should do what tasks, although collaborating on a task is often an option
> My current goal with AI coding tools is to spend 100% of my time doing stuff that matters. (As a UI prototyper, that mostly means tinkering with design concepts.)
This struck me as weird, both in terms of “tinkering” being the most important thing to be doing, and also in describing “working like a surgeon” as tinkering.
That isn't how analogies work--they are about partial similarities, not equivalence. The OP never says or implies that working like a surgeon is tinkering--allowing focus on the most important thing to be doing doesn't mean that the most important thing is the same for everyone.
Interestingly, I used the exact same metaphor the other day, suggesting that tools like GitHub Speckit should have a /wrap-up command:
> And then maybe "/wrap-up" to tie all the untied knots once you're sufficiently happy. Kinda like a surgeon stepping aside after the core part of the operation.
Spent most of my time thinking and asking Claude the right questions at the right moment instead of typing code. Review the code the agent generates, let it run tests, deploy to a PR branch for live debugging. Review the console log and network traffic, paste the information into Cursor, and ask Claude for the root cause, code paths and data flow. Solve issues one at a time.
It does feel like a surgeon working with a team of assistants. A lot of information, a lot of decisions, a lot of patience and focus.
If you code like a surgeon, the feature/bug has to be completed during the code surgery, with the story points completed immediately (during surgery). In one sprint, 3 years' worth of features/bugs will be fixed. What will they do for the remaining 3 years?
This post is no longer relevant with Codex and Claude 4.5. You don’t need to be the savior that does the highly specialized important work. You just need to come up with the design and specs you need, then you need to ensure it implements them. So, you act as architect and manager, not a surgeon.
- Good automated tests which the coding agent can run (see the pytest sketch below). I love pytest for this - one of my projects has 1500 tests and Claude Code is really good at selectively executing just the tests relevant to the change it is making, and then running the whole suite at the end
- Give them the ability to interactively test the code they are writing too. Notes on how to start a development server (for web projects) are useful, then you can have them use Playwright or curl to try things out
- I'm having great results from maintaining a GitHub issues collection for projects and pasting URLs to issues directly into Claude Code
- I actually don't think documentation is too important: LLMs can read the code a lot faster than you can to figure out how to use it. I have comprehensive documentation across all of my projects but I don't think it's that helpful for the coding agents, though they are good at helping me spot if it needs updating.
- Linters, type checkers, auto-formatters - give coding agents helpful tools to run and they'll use them.
For the most part anything that makes a codebase easier for humans to maintain turns out to help agents as well.
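To make the first bullet concrete, here is a minimal, entirely hypothetical pytest module of the sort an agent can run selectively; apply_discount is a toy stand-in for real application code, and the -k/-x flags in the trailing comments are standard pytest options:

    # test_pricing.py - a toy, agent-runnable test module; apply_discount is a
    # stand-in for real application code that would normally live elsewhere.
    import pytest

    def apply_discount(price, percent):
        """Toy implementation so the file runs on its own."""
        if percent < 0:
            raise ValueError("discount cannot be negative")
        return price * (1 - percent / 100)

    def test_discount_basic():
        assert apply_discount(100.0, 10) == pytest.approx(90.0)

    def test_discount_zero_percent():
        assert apply_discount(100.0, 0) == pytest.approx(100.0)

    def test_discount_rejects_negative():
        with pytest.raises(ValueError):
            apply_discount(100.0, -5)

    # An agent touching discount logic can run just the matching tests first, then
    # the full suite before wrapping up:
    #   pytest -k discount -x    (only matching tests, stop at first failure)
    #   pytest                   (whole suite at the end)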
If your development team consists of autistic junior programmers with eidetic memory, then you damn well better make sure that your documentation is exceedingly thorough, absolutely unambiguous, and as restrictive as you can make it.
I sometimes use AI summaries to get the answers I need out of badly written documentation. That's about as far as I find any value or productivity boost.
Consider that this "surgeon" analogy has always been applicable when docs or books are better written and full of useful examples. Also consider that a lot of the annoying "plumbing code" you probably want AI for is fairly closed-ended as there are only so many combinations of API use possible.
I'm really not understanding the continued hype in 2025.
> How much time have you spent running a coding agent like Claude Code
I spent a month trying to get it to build ImGui apps in C++ and Python. I did ImGui because I wanted to try something that’s not totally mainstream. I cancelled before the second month.
It would essentially get stuck over and over and could never get itself out of the mud. In every case I had to either figure out what was wrong and explain it, or fix the code myself, which was mentally taxing: after it makes a batch of changes, because I don’t trust it, I have to sort of re-validate from scratch every time I need to dig it out of the mud.
In the end it wasn’t faster, and definitely wasn’t producing better quality. And I’m not particularly fast.
The best analogy I can come up with is it’s like the difference between building a SpaceX rocket, and making a cake that looks like a rocket. You probably think that’s absurd, no one would try to use a cake that looks like a rocket. But it’s a really good cake. It blows smoke and has a system of hydraulics and catapults that make it look like it’s taking off. And the current situation is, people look at the cake rocket and think, “it’s not perfect but it’s basically 70% there and we will figure out the rest before long”. Except you won’t. It’s a cake.
And then we point to the documentation, and tests it built as evidence it’s not cake. But those are cake too.
So then the question is, if you live in a simulation made of cake but it’s indistinguishable from reality, does it matter?
Ouch, sounds like its C++/ImGui abilities aren't up to snuff yet.
I tend to use it for Python, JavaScript and Go, which are extremely well represented in its training data. I don't know enough C++ myself to have tried anything with that language yet! Sounds like I'd be disappointed.
Because it is massively more productive than when you have to manually approve what it's doing. You can literally leave it running for an hour and come back to a fully baked solution (or an unholy mess depending on how the dice rolls go).
And OK, I get that my persistence in trying to help people understand this stuff can be annoying to people who have already decided that there's nothing here.
But in this case we have someone who looks like they are still operating on a 2024 model of how this stuff can be useful.
The "coding agents" category really does change things in very material ways. It only kicked off in February this year (when Claude Code released) and if you haven't yet had the "aha" moment it's easy to miss why it makes such a difference.
I'm not going to apologize for trying to help people understand why this matters. Given how much of a boost I'm getting from this stuff in my own work, I honestly think it would be unethical for me not to share what I'm learning.
XD yes sure. I'd most definitely put those on the same level. Maybe even favor a UI prototyper if it comes down to the real deal. Who needs open heart surgery when you can have a magnificent css-hover animation that really seals the deal on some 90%-AI-generated slop that only caters to delusional top management completely out of touch with reality.
Irony off: Let's try it with a bit of humbleness next time, ey?
This article further emphasizes the obvious point that, as programmers are well on the way to destroying their own profession and their work is poised to wreck the entire world, it's time to raise awareness that programmers work with much less discipline and responsibility than other professional, accredited, licensed trades, like, say... doctors.
But for now, sure, why not just compare yourself to surgeons, as you already anoint yourselves as "engineers".
If we're going down this road, then I claim programmers are more skilled than engineers or doctors.
90% of engineers are not inventing new things--they are merely applying codified knowledge (which they didn't create) to a new location or instantiation. In contrast, every new program is unique in the world (if it were not then it wouldn't need to be written--you'd just copy/fork/download the program).
And don't get me started on doctors. More than 40,000 Americans die each year from medical errors[1]. You really think the casualty rate from programmers--with all our mistakes--can beat that? I don't think so.
So, yeah, maybe surgeons are not the right model. Maybe surgeons could learn a thing or two from programmers.
Sometimes I wonder how many people die indirectly from specific decisions. Like, large A-pillars in cars definitely sometimes save people when the car rolls over, but how many people die each year because a large A-pillar makes a blind spot and the driver hits a pedestrian? How many people die each year due to traffic slowing down ambulances?
People dying directly due to software bugs (e.g. Therac-25) is pretty rare, but what about an inefficient network that caused congestion and made a heimlich maneuver youtube video load slightly too slowly to save someone from choking to death? I don't think there's any way to measure it, and it's almost certainly important to train surgeons more than software engineers, but I still do wonder.
I agree with you: indirect causality is what kills people. Directly attributable causes are rare because they are easy to fix.
I also suspect these indirect causes are not easy to fix. It's not like 1 or 2 bugs cause 10,000 indirect deaths. It's more like 10,000 different bugs/design flaws cause 1 or 2 indirect deaths each.
> 90% of engineers are not inventing new things ...
That raises the question of just how much of a difference you need for something to be an invention. Merely copying another program is not invention, while coming up with and implementing TCP definitely is. But is implementing another instance of a CRUD app invention? Is configuring a server invention? What about if the configuration is written by a script? The dividing line seems harder to pin down than one might at first think.
Is another CRUD app any different than another static-force analysis on a house beam? How many engineering decisions are made by software running the math and spitting out numbers?
I agree that there are degrees of invention in both, but my argument (which is, admittedly, emotional in nature) is that there are more unique decisions in software, mostly because if you don't need to decide--if there is already an algorithm for it--then you can just plug in an existing library/app.
You only need a programmer when you can't just run an existing app. Therefore, all programmers create things that don't already exist. That's the definition of invention.
I know I'm exaggerating here, but it's less of an exaggeration than saying that programmers are "poised to wreck the entire world".
> You really think the casualty rate from programmers--with all our mistakes--can beat that?
I think that if my infrastructure + code had any direct connection to patient outcomes, there would be a lot of harm done. Not that I'm particularly bad at either, but I know the effective cost of errors is minimal, and certainly does not have a direct impact on people's health. If I had the same responsibilities as a surgeon, I'd have a much slower rate of change in my systems.
I do not in any way believe that the fact that we in IT kill fewer people than surgeons has any meaning for whether we're more skilled than doctors.
My comment is really an emotional reaction to the (very common) denigration of software engineers, so don't take it too seriously.
But I also think that good software engineers can scale the reliability of their software to fit the purpose. There is a ton of software in medical devices, and despite the well-publicized fatal bugs, 99.9% of medical software works without error.
In fact, the automation of things like drug dispensing at pharmacies has decreased mistakes. I think if you deleted all the medical software in the world, deaths would increase.
This is a good perspective. In most disciplines where they need some level of repeatable quality, like medicine or construction, they have constrained the environment to such a degree that they can engage in somewhat repeatable work.
Consider how well cars would work if they had to traverse arbitrary terrain. Instead, we have paved the earth to make their usage somewhat consistent.
Construction projects repeat the same tasks over and over for decades. Surgeons perform the same surgeries for decades. But in software if something is repeatable and useful, it becomes an app or a library or a framework.
Some things in software have been heavily constrained to make the terrain navigable. Cloud computing in containers is one example. But in general, there’s a much higher degree of navigating uncharted territory in software than these other fields. Even building a CRUD app is often wildly complex, not the CRUD part, but the mapping of a specific business’s processes to that CRUD app, and getting those specific employees who currently work there to use it, is itself quite a novel undertaking.
A surgeon has 4 years of undergraduate education, 4 years of medical school, and a 5 year residency, learning to operate (pun intended) with other equally highly trained specialists, many of whom are peers, like anesthesiologists, not merely support. The comparison was already dubious when Brooks made it for operating systems programming. Setting up a comparison with the average "I don't use anything I learned in my CS degree, lol" coder wrangling a chorus of hallucinating stochastic parrots is a bonkers level of hubris from techbros.
Most surgeons don't use anything they learn in medical school or residency either. It's usually their last 2 - 4 years (depending on whether they did a fellowship) that is useful in their day to day job. E.g. an eye surgeon doesn't need to know how to read an ECG
I’ve long advocated that software engineers should read The Mythical Man-Month[0], but I believe it’s more important now than ever.
The last ~25 years or so have seen a drastic shift in how we build software, best trivialized by the shift from waterfall to agile.
With LLM-aided dev (Codex and Claude Code), I find myself going back to patterns that are closer to how we built software in the 70s/80s, than anything in my professional career (last ~15 years).
Some people are calling it “spec-driven development” but I find that title misleading.
Thinking about it as surgery is also misleading, though Fred Brooks’ analogy is still good.
For me, it feels like I’m finally able to spend time architecting the bridge/skyscraper/cathedral, without getting bogged down in terms of what bolts we’re using, where the steel come from, or which door hinges to use.
Those details matter, yes, but they’re the type of detail that I can delegate now; something that was far too expensive (and/or brittle) before.
[0]https://en.wikipedia.org/wiki/The_Mythical_Man-Month
There's not a lot that Brooks got wrong but the surgical team is it.
There's not a lot of team in a surgical team. Software does not need to be limited to one thing happening at a time the way open heart surgery does. There's room for more hands.
It's more like a sports team. But we don't practice, review, or coach the way they would, and so it takes a lot longer for us to reach excellence.
> There's not a lot of team in a surgical team
Surgeon, assistant, tech, anesthesiologist, plus probably a nurse...?
Have you ever seen or been to a serious surgery? The operating room is full of people, and those are some of the most gelled teams you'll find anywhere.
Brooks' idea is that the surgeon calls all the shots and everyone else is being orchestrated by that one person. There is only one person holding a knife at any point and that's generally the surgeon.
How many people are there besides the surgeon? About six? That's pretty much all one person can wrangle.
That's not enough for a large scale software project. We keep trying to make it work with Scrum teams but it is broken and we know it.
Most of the highest functioning projects I've been on have had a couple of 'surgeons'. And while they often work on separate 'patients', it's not always the case.
Aren't there some surgeries now where more than one surgeon is operating concurrently? All I can find is that there's an Insurance Code for it.
Can you go into a bit more detail on your perspective of the 70s/80s approach vs. today? I’m an analyst with a passion for engineering, but am not an engineer by trade. So honestly I am naive to how now would be different from the past.
My take is that 70/80's built programs from a set of blueprints of what was required. Where each programmer had a set of requirements, knew what were required, when it is needed to be completed by and the tools available to enable the next level in development. If someone lagged behind then the project halted until but the end result was solidity and a completed application. At least during that time other programmers could improve their work and document.
Meanwhile with agile, its always felt like a race to me. If you didn't complete your part then spend a whole week focusing on it while the others carry on with anticipation that the sprint will result in completion of the part required. Enabling for the next part they've built to be integrated.
Vibe coding offers this "make a text box write to a file" code generation and it does. However without any blueprints the code starts to crumble when you proceed to introduce middleware such as authentication.
It was never discussed on how authentication should authenticate because someone is already that far ahead in their work so you end up mashing more code together until it does work. It does and your product holds premise.
The we'll improve, document and fix later never comes because of the influx of feature requests leading it to bloat. Now bogged down in tech debt you then spend resources in wrangling the mess with senior engineers.
The senior engineers are now fixing the mess resulting in their experienced code not integrating with that of the juniors. The seniors now having to tidy that code leave behind the real code they were originally tasked in improving turning the whole codebase in to something that's a diabolical mess, but hey, it works.
Hardware is cheaper than refactoring so instead you then "scale" by throwing more hardware at it until it handles the pressure.
Someone then leaves and knowledge share is lost. Some exec promotes someone from the team who was crazy-sane enough to keep all in check to the senior role while skimping them on pay and are now handling the lost work, theirs and keeping the team in check.
The product starts to fail and new junior engineers are bought in with new naive wisdom, jumping up and down with the newest fancy library tech finally making the process complete causing it to repeat itself indefinitely.
> My take is that 70/80's built programs from a set of blueprints of what was required. Where each programmer had a set of requirements, knew what were required, when it is needed to be completed by and the tools available to enable the next level in development.
The thing is, you couldn't start dev until you had those blueprints. So that's a lot of time at the start of the project where development had to sit idle even if you had a good idea of what the architecture would at least be.
> If someone lagged behind then the project halted until but the end result was solidity and a completed application. At least during that time other programmers could improve their work and document.
No, you didn't get this. Apps that completed had bugs then too, whether in design, specification or implementation. It's why the waterfall paper basically said you'd have to built it twice either way, so you should try to accelerate building it the first time so you'd know what you messed up when you built it the second time.
Or as Mel Brooks, who wrote the Mythical Man-Month would say, "Build one to throw away; you will, anyhow."
Nor could programmers productively spend downtime simply document things, the documentation was supposed to have already been written by the time they were writing out punch cards. The "programming" had already been done, in principle, what remained was transcribing the processes and algorithms into COBOL or FORTRAN.
Startups are perfectly free to adopt the methods of the 70s if they wish, but they will be outcompeted and ground into dust in the process. Likewise, there is more to agile than Scrum (which is what you're describing with sprints), and it seems weird to describe the dread you'd get of blocking your team if it takes a week to do your part but act is if a week slip on the critical path in a waterfall effort is no big deal.
I mean, you're actually right that many (not all) waterfall-based teams treat it like it's no big deal, but that's the reason that waterfall projects were often disastrously over-time and over-budget. "We've already slipped 3 weeks to the right, what's another day?". Well, those add up... at least with agile you can more easily change the scope to fit the calendar, or adapt to changing market pressures, or more rapidly integrate learnings from user research.
I imagine The Mythical Man Month would have been much more entertaining if written by Mel Brooks ;-)
More the 70s than 80s; our company wrote software in the mid 80s more or less the same as we (my company) still does; in the 80s-90s we just did ship a v1.0 a lot later than we do now, simply because in those times ages were basically impossible. Especially software on cardridges made it so you had to ship with 'no bugs'. But no blueprints; we just started on an idea we had and worked until it was good enough to ship.
> But no blueprints; we just started on an idea we had and worked until it was good enough to ship.
Yes, exactly. A lot of UNIX and other very good software for computers back then came about this way too. No or minimal blueprints and a lot of iterative implementation & testing and reacting to what you see.
It's hard convincing people today that agile methods have been in use long before sprints were a thing.
We're now getting in a position where we have CAD for software, aka CASE, Computer Aided Software Engineering. You can focus on the design of the software, instead of spending hours typing out code.
The question is: is that an AI thing or just the domain of conventional devtools.
> bridge/skyscraper/cathedral
> Those details matter, yes, but they’re the type of detail that I can delegate now
No...
If you're building a skyscraper, there's no world where you can delegate where the steel or bolts come from. Or you'll at least need to care about what properties that exact steel has and guarantee every bit used on your project matches these constraints.
If you don't want to care about those, build residential houses with 1000x less constraints and can be rebuilt on a dime comparatively.
You might be thinking about interior decoration or floor arrangement ? Those were always a different matter left to the building owner to deal with.
In the world of construction there’s generally an owner, who then works with three groups: an architect, an engineer, and a general contractor.
Depending on what you’re building, you might start with an architect who brings on a preferred engineering firm, or a GC that brings on an architect, etc.
You’re right to question my bridge/bolt combo, as the bolts on a suspension bridge are certainly a key detail!
However, as a programmer, it feels like I used to spend way too much time doing the work of a subcontractor (electrical, plumbing, hvac, cement, etc.), unless I get lucky with a library that handles it for me (and that I trust).
Software creation, thus always felt like building a new cathedral, where I was both the architect and the stone mason, and everything in-between.
Now I can focus on the high-level, and contract out the minutia like a pre-fab bridge, quality American steel, and decorative hinges from Restoration Hardware, as long as they fit the requirements of the project.
I think you're taking a metaphor a bit too literally.
No, it's perfectly apt. One comment is stating that using LLMs allows them to gloss over the details. The responding comment is saying that glossing over details is not a great idea, actually. I think that statement holds up very well on both sides of the analogy. You can get away with glossing over certain details when building a little shed or a throwaway python script. If you're building a skyscraper or a full-fledged application being used in the real world by thousands or millions of people, those details being glossed over are the foundation of your entire architecture, will influence every other part of the decision-making process, and will cause everything to crumble if handled carelessly.
Look at my beautiful cathedral
Look at my cathedral of tailwind ui
I’m sure they put locks on the doors
> If you're building a skyscraper, there's no world where you can delegate where the steel or bolts come from.
ah yes, I'm sure the CEO of Walsh (https://www.walshgroup.com/ourexperience/building/highrisere...) picks each bolt themselves directly without delegation
That's actually a good analogy.
A novice surgeon operating under the impression the nursing and anesthetics staff will help them if they make a mistake, will kill a patient very quickly.
Just because you can't necessarily do surgery effectively without these teams doesn't mean you don't need the senior surgeon to train you first (or senior surgeons to begin with).
And a bad anaesthetist or a nurse that obstructs the surgeon's view will kill a patient despite the quality of the surgeon, though the adept surgeon may manage to spot impending doom early and work around it.
The big problem comes when aspiring surgeons without the necessary experience think it's all small potatoes because they don't have to know much about which scalpel to use because the nurse will hand it to them anyway.
So yes, if the cost of killing unnecessary amounts of patients so that eventually you will learn to do surgery this way is fine, then by all means code like a surgeon from day one. Otherwise go to medschool first like the rest of us.
The author kindly informs us that he is a "UI prototyper [...] tinkering with design concepts" and also that he works for a company making AI coding software. This double-whammy may somewhat explain the strong Dunning-Kruger gravitational lensing observed when viewing the article from a distance.
Is Notion a company making "AI coding software" these days?
If you want to learn more about Geoffrey I suggest browsing through https://www.geoffreylitt.com/#projects - I've been following his work for a few years, "UI prototyper" is him under-selling himself.
This analogy is fundamentally flawed, both literally and metaphorically:
First, the literal.Surgeons are managers of the operations they perform and heavily rely on the surgical team with which they work. If the author had any clue about surgeries, they would understand that the most important person in a major surgery is the anaesthesiologist, not the surgeon(s).
Second, the metaphorical.
The author goes to great lengths to identify "grunt work" as being "not the most intellectually fulfilling or creative part of the work." What they do not do is understand that there is no such thing as "grunt work" if, for any definition of work, it is valued without judgement.
But if a person identifies with being "the surgeon", with everyone else being "a support team that handles prep, secondary tasks, admin", then the post makes sense from an egocentric perspective.
I assume you haven't read The Mythical Man-Month[0]?
The author is referencing an existing analogy from Fred Brooks, and building upon it.
Sure, today the anesthesiologist might be the most "important" person in the room, but that's not the idea behind the analogy.
Your emphasis that surgeons "heavily rely on the surgical team" is just as important to Brooks' beliefs, in that the "Chief Programmer" is only able to do what they do via the support of the team.
The "grunt work" (noted by the author) seems solely focused on tasks given to the "Co-pilots" (or assistant programmers), notably with no specific mention to the other supporting roles (admin, editor, secretaries, clerk, toolsmith, tester, and "language lawyer"), many of which have been replaced by SaaS tooling (Github, Jira, Notion, Docusaurus, etc.) or filled by other roles (PMs, SDET, etc.).
Furthermore, the author even states:
> I hate the idea of giving all the grunt work to some lower-status members of the team. Yes, junior members will often have more grunt work, but they should also be given many interesting tasks to help them grow.
The author clearly sees less experienced programmers as mentees, rather than just some grunts whose work is beneath them.
The analogy may not be perfect, but their message about "AI coding tools" should be valued without judgement (and without accusations of egocentric thinking).
[0] https://en.wikipedia.org/wiki/The_Mythical_Man-Month
> the most important person in a major surgery is the anaesthesiologist, not the surgeon(s)
Could you explain more? It seems to me that, as sans surgeon there is no surgery, the surgeon is inevitably the most important person. Anæsthetic in its current form is a comparatively recent invention; historically major surgery was done without it, and in emergencies can still be done without it (at the cost of excruciating pain and far higher risk of negative outcomes, of course).
I learned about this recently, myself, and was surprised and ended up looking into it.
I read that the anesthesiologist is the person responsible for the patient during the surgery.
Apparently their role is:
- Provide continual medical assessment of the patient
- Monitor and control the patient’s vital life functions, including heart rate and rhythm, breathing, blood pressure, body temperature and body fluid balance
- Control the patient’s pain and level of consciousness to make conditions ideal for a safe and successful surgery
The gist I got from the other things I've read is that the anesthesiologist also has the most go/no-go responsibility before and during surgery.
Source: https://www.medschool.umaryland.edu/anesthesiology/patient-i...
Edit: I recommend reading more about the anesthesiologist's role. I found it interesting even as an entirely casual observer.
Long ago I learned that anesthesiologists make more money than devs when starting out. Not a lot more, but most medical professionals do not.
When I asked why, the answer was that it's the most dangerous job there. There's more opportunity to kill a patient with the anesthesia than by any other means.
The main reason they don't want you to eat before surgery is that you can regurgitate and damage your lungs. But even if they solved that, the anesthesiologist's job is easier if you're in a fasting state both before and at the end of a surgery.
A quadruple bypass is not dangerous because you're stitching 4 new arteries onto a heart. It's dangerous because it takes so long to stitch 4 new arteries onto the heart that you're running up against the limits of how long you can safely keep someone sedated without causing life threatening complications.
I'm having trouble finding current statistics but at the time I was learning this, a double bypass was many times safer than a quadruple. Articles on bypass surgery understandably focus on the aspects that are within the patient's control.
I’m not sure where you’re getting your salary data, but doctors routinely earn more than developers
Experienced doctors, yes. Though I’m hearing some ridiculous salaries in SF.
> Long ago
> Experienced doctors, yes. Though I’m hearing some ridiculous salaries in SF.
There is no medical specialty except perhaps pediatrics/geriatrics where the pay will start below $200,000. There is a relatively modest effect of seniority on physician salaries (there is a huge amount of quality control/gatekeeping before one becomes an attending). This is nationwide, not in SF.
Here are data in SF: https://www.doximity.com/reports/physician-compensation-repo...
And another source that reflects, purportedly, the data used in the above: https://www.instagram.com/reel/DL6MT-ZMJ7K/
I'm not a developer but I don't think $300-400,000, normal salaries for fields like inpatient psychiatry or subspecialty medicine, are common for new developers, or even for any developer (vs a manager).
I thought the duration-related risk for that kind of surgery was based on how long the patient is put on a heart and lung machine? Naively I'd expect that to be riskier than the anesthesia.
How is "the patient is put on a heart and lung machine" not the field of anaesthesia? That's exactly what it is.
I do know all that, but it still doesn’t really seem enough to qualify them as the most important person. Sure, they have the biggest power of veto, but without the surgeon, there is no surgery at all.
Remove the anæsthetist, and procedure forbids you from continuing. Remove the surgeon, and it’s impossible to continue. … Or remove the patient. I guess the patient is the most important person there after all! (Actually, that does a pretty good job of showing how the entire notion of “most important” may not make sense.)
Without the surgeon, you just get a nurse or a butcher or someone who stayed at a Holiday Inn last night.
It's about as reasonable as doing surgery without an anesthesiologist.
Probably more so really.
Yeah... I guess at least your patient isn't awake to deal with whatever horror is happening.
>Sure, they have the biggest power of veto, but without the surgeon, there is no surgery at all.
And without an anaesthesiologist there is no surgery at all either, for 90% of cases.
I'd rather have a nurse have a go than do this awake, without sedatives and without anyone monitoring me. Just think about that for a second.
100% > 90%
this line of thinking is flawed - remove this tiny resistor from the motherboard and the computer will not boot -> this proves this tiny resistor is as important as the CPU.
Can you build a computer without that resistor?
The logic is about what is absolutely necessary to achieve what you try to achieve
What makes the anaesthesiologist more important?
> But if a person identifies with being "the surgeon", with everyone else being "a support team that handles prep, secondary tasks, admin", then the post makes sense from an egocentric perspective.
They're not talking about other people being the support team. They're calling the AI tooling the support team.
The anesthetist is the person who is primarily responsible for the patient remaining alive. You can decide for yourself whether that's more or less important than what the surgeon's doing.
As someone who has had to have 3 surgeries in the last few years I'm very grateful that both the surgeon and anesthetist in each case did a fantastic job and didn't do anything like the author of TFA is suggesting. FWIW I don't think the analogy in the article survives knowledge of what surgeons actually do.
The analogy should give you a hint of your kind of responsibility and work. It may be flawed, but most people get the point, especially compared to vibe coders who just wait outside the operating room.
I find that a lot of similar analogies are flawed.
On the landing page of one of the frameworks (I don't remember which one, unfortunately) there was a description comparing a programmer to a woodworker.
It was written that this woodworker, as a reliable and skilled craftsman, meticulously makes each piece of furniture with the same care, which isn't really true. For example, quite often the back panels were left unfinished, with traces of aggressive planing.
So the whole premise that "this framework will help you craft software as meticulously as a woodworker crafts furniture" doesn't check out.
If an analogy isn't flawed it's not an analogy, it's a description.
Does the analogy say the surgeon is the most important person? The surgeon does the work.
You can even do an operation without the most important person but hardly without someone taking the position of the surgeon.
Ask https://en.wikipedia.org/wiki/Leonid_Rogozov
It's a nice analogy, and I think I'll use it in future.
If you want another one, think of painting. An "Old Master" painter like Rembrandt or Rubens or Botticelli would have had a large workshop with a team of assistants, who would not only do a lot of the work like stretching canvases or mixing the paints, but would also - under the master's direction - actually do a lot of the painting too. You might have the master sketch out the composition, and then paint the key faces (and, most of all, the eyes) and then the assistants would fill in areas like drapery, landscape, etc.
This changed in the Romantic period towards the end of the 1700s, with the idea of the individual artist, working alone in a moment of creative inspiration and producing a single work of genius from start to finish. Caspar David Friedrich or JMW Turner come to mind here.
Some programmers want to be Turner and control the whole work and feel their creativity is threatened if a machine can now do parts of it as well as they could. I'd rather be Rembrandt and sketch out the outline, paint the eyes, and leave the rest to junior engineers... or an AI Agent. It's a matter of preference.
> I'd rather be Rembrandt and sketch out the outline, paint the eyes, and leave the rest to junior engineers
What you’re not mentioning is that code isn’t an end product. It’s the blueprint for one. The end product is the process running and solving some need.
What makes software great is how easy it is to refine. The whole point of software engineering is to ensure confidence that the blueprint is good, and that the cost of changes is not enormous. It’s not about coding quickly, throwing it over the wall, and being done.
The process you outline would be like noting down a few riffs, fully composing a few minutes (measures?), and then having a few random people complete the full symphony. It’s not a matter of having a lot of sheet music, it’s a matter of having good music. The sheet music is important because it helps transmit the ideas to the conductor, who then trains the orchestra. But the audience doesn’t care about it.
So same here: users don’t care about the code, but they do care about bugs and missing features. Acting on that feedback requires good code. If you can get good code with your process, it’s all good. But I’m still waiting for the proof.
I like this approach. After some months of gamely trying to involve Claude in my coding process, I find it's much more enjoyable and efficient for me to just write the code myself, rather than babysitting it and going over and over something it wrote finding errors and logical flaws. I was about to cancel my subscription.
This month, however, I had to start upgrading a very large db from MySQL 8 to 9, on a codebase that involves a few hundred long and complex queries. I wrote this code, so I know what they do and I understand the schema, but I'm also aware that some of them may violate ONLY_FULL_GROUP_BY, and a few (but can't remember which) had historical oddities unless DERIVED_MERGE was turned off. I was girding myself to run tests on dozens of suspects, but as a lark I handed Claude the whole codebase and the schema and asked it to identify any queries that might break if those two default options were enabled. Astonishingly, it did. It identified 15 queries and explained what they violated. It was wrong about 7 of them... those seven were actually fine, and it acknowledged that it had farmed this out in some way and that whatever it used for evaluation had been overly conservative and had returned false positives. But the other 8 queries were fixable. It was also a bit off about how to rewrite them - it created a lot of slow and inefficient workarounds. It neglected the idea of lateral joins, it didn't write SQL optimized for the indexes, etc. But it saved me hours and hours of locating these queries myself.
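For anyone who wants to double-check a flagged query by hand, here's a minimal sketch (assuming mysql-connector-python; the connection details and the sample query are placeholders, not the actual codebase) that turns on ONLY_FULL_GROUP_BY and derived_merge for a single session and re-runs the suspect:

    # Sketch: re-run a flagged query with ONLY_FULL_GROUP_BY added to the
    # session's sql_mode and derived_merge enabled in optimizer_switch,
    # so a violation surfaces as an error without changing server defaults.
    import mysql.connector  # assumes mysql-connector-python is installed

    # Hypothetical example: selects a non-aggregated column that isn't in
    # the GROUP BY, which ONLY_FULL_GROUP_BY rejects.
    suspect_query = "SELECT customer_id, status FROM orders GROUP BY customer_id"

    conn = mysql.connector.connect(
        host="localhost", user="app", password="***", database="appdb"
    )
    cur = conn.cursor()
    cur.execute("SET SESSION sql_mode = CONCAT(@@sql_mode, ',ONLY_FULL_GROUP_BY')")
    cur.execute("SET SESSION optimizer_switch = 'derived_merge=on'")
    try:
        cur.execute(suspect_query)
        cur.fetchall()
        print("runs clean under the stricter settings")
    except mysql.connector.Error as err:
        print(f"flagged for a reason: {err}")
    finally:
        cur.close()
        conn.close()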
To me, this idea of letting the LLM render down a lot of code into potential pain points seems like a much more valuable use of it than asking it to write code itself. I kept my subscription for that reason.
This has been saving me a lot of time as well in a decade old code base. I can paste a stack trace and provide additional relevant context, then ask the LLM to do a first pass debug.
From that I usually get a list of file+lines to manually review, along with some initial leads to chase.
Another use case is when fixing performance issues. I can feature flag my fix and ask the model to confirm the new code path will produce the same result for a given set of inputs. We also have test coverage for this kind of thing, but the LLM can do a once-over and point out some flaws before I ever run those tests.
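A rough sketch of that kind of feature-flag check (the flag, function, and test names here are hypothetical, not from any real codebase): keep both code paths behind a flag and add a test asserting they agree on representative inputs, which gives the LLM, and CI, something concrete to run.

    # Sketch: legacy and candidate code paths behind a flag, plus a
    # pytest-style test asserting they agree on a sample input.
    import os

    def compute_report(data, use_fast_path=None):
        if use_fast_path is None:
            # Hypothetical flag name; real systems often use a flag service.
            use_fast_path = os.environ.get("FAST_REPORT", "0") == "1"
        return _report_fast(data) if use_fast_path else _report_legacy(data)

    def _report_legacy(data):
        # Existing, known-good (but slow) implementation.
        return sorted(set(data))

    def _report_fast(data):
        # Candidate performance fix being verified.
        return sorted(dict.fromkeys(data))

    def test_paths_agree():
        sample = [3, 1, 2, 3, 1]
        assert compute_report(sample, use_fast_path=False) == compute_report(sample, use_fast_path=True)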
I haven’t gotten to the point where it writes much code for me beyond the auto-complete, which has been a modest boost in efficiency.
Yeah. As a debugging aid, I think it's fairly solid at surfacing things to look at and fix manually. And when you do that, you're actually hoping for more false positives than false negatives - which plays to the strengths of an LLM. When it comes to asking for rewrite suggestions for anything, I have to really go over its logic with a fine-tooth comb, because there are usually edge cases that can be spotted if you really think through it. I abhor its tendency to use try/catch. I've seen it write weird SQL joins that slow down queries by 30x. I'd never trust it to debug a race condition or to consider any side effects outside the 30 LoC it's currently looking at.
I guess I wouldn't trust it to confirm that new code would give the same result, but it can't hurt to ask, since if it told me the code wouldn't, that would make me look more closely at it.
I think as long as you look at it as part of a distillation process, and aim for false positives, and never actually trust it, it's good at helping to surface issues you may have missed.
This reminded me of a slide from a Dan North talk - perhaps this one https://dannorth.net/talks/#software-faster? One of those anyway.
The key quote was something like "You want your software to be like surgery - as little of it as possible to fix your problem".
Anyway, it doesn't seem like this blog post is following that vibe.
I like this quote.
Unfortunately, my predecessor at work followed a different principle - "copy paste a whole file if it saves you 5 minutes today".
Well, I am still a surgeon, I just do a lot of amputations.
So the Chief Programmer Team structure [0] is back in fashion, is it.
But this time with agents.
Fred Brooks has never been more relevant.
[0]: https://en.wikipedia.org/wiki/Chief_programmer_team
Yes, I cite Brooks (and Harlan Mills, seemingly the original source of the idea) in the post!
I’m just glad I’m not the only one revisiting past structures that fell apart at the time because they involved humans.
Now that we have human-like automation, everything needs revisiting.
I'm kinda surprised this isn't more popular. I figured we'd go this way eventually as we single out 10x-ers, give them a highly competent crew, and save a lot of money over your most expensive code monkey wasting time attending meetings, filling out Jira tickets, and giving presentations to the customer. You pay them a shitload of money - shouldn't you get every dollar's worth?
Honestly, at every job I spend an unreasonable amount of time getting up to speed on things that are only tangentially related to my job (No, here we need you to check all the boxes in the Jira ticket, ensure it's linked to a zephyr ticket, and ensure it's linked to a git PR - we don't care about you adding attachments or comments!)
Wait, are you parodying Madonna or meta-parodying Weird Al?
Please no, it was bad enough getting Zappa stuck in my head whenever I had to wrangle DynamoDB.
Sorry!
Awareness of the autonomy spectrum, or as I call it, the delegation spectrum, seems to be the hardest part of adopting AI coding assistants effectively.
I think maybe it's because engineers aren't used to delegating: it seems like founders tend to pick it up faster than career engineers.
> The surgeon focuses on the important stuff
Yikes. Don’t code the way this author thinks surgeons operate. All those “support” tasks are critically important and you likely couldn’t do them as well. Be humble and appreciate that all the stuff being done around you is also the important stuff. Support them just as much!
I really like Geoffrey Litt's new analogy for working with AI coding tools:
> Personally, I'm trying to code like a surgeon.
> A surgeon isn't a manager, they do the actual work! But their skills and time are highly leveraged with a support team that handles prep, secondary tasks, admin. The surgeon focuses on the important stuff they are uniquely good at.
It's also a neat callback to the Mythical Man Month, the most influential early textbook on large scale software engineering.
I'm somewhat of a prompt surgeon myself. I find prompts online and then hash them together to fit my needs.
frankensteinian
Modern surgeons evolved from barbers and butchers.
Code surgeons, where are they now on the evolution path? Barbers? Butchers?
The surgical metaphor really resonates with my experience building real-time interview analysis systems. One key parallel I've found is that both surgery and complex software require careful state management and graceful error recovery.
When we first implemented real-time code analysis during interviews, we struggled with the "do no harm" principle - how to provide feedback without disrupting the candidate's flow. Our initial approach of running full AST analysis on every keystroke caused noticeable UI lag (150-200ms). We solved this by moving to a chunked analysis pattern with a 500ms debounce and incremental parsing. This reduced CPU usage by 70% while keeping perceived latency under 50ms.
The trickiest part was handling partial/invalid syntax states without crashing the analyzer. We ended up implementing a recovery mechanism similar to how modern IDEs handle incomplete code - maintaining a stack of valid states and rolling back on error rather than failing completely. Curious if others have found better approaches to real-time code analysis with incomplete input?
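For reference, a bare-bones sketch of the debounce-plus-rollback idea described above (the class and method names are made up, and a real implementation would use incremental parsing rather than re-parsing from scratch):

    # Bare-bones sketch: debounce incoming edits, try to parse, and fall
    # back to the last known-good state instead of crashing the analyzer.
    import ast
    import threading

    class DebouncedAnalyzer:
        def __init__(self, delay=0.5):     # 500ms debounce, per the comment above
            self.delay = delay
            self._timer = None
            self._good_states = []         # stack of (source, tree) pairs that parsed cleanly

        def on_edit(self, source: str):
            # Restart the debounce timer on every keystroke.
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self.delay, self._analyze, args=(source,))
            self._timer.start()

        def _analyze(self, source: str):
            try:
                tree = ast.parse(source)
                self._good_states.append((source, tree))
            except SyntaxError:
                # Partial/invalid input: roll back to the last valid state.
                if not self._good_states:
                    return
                source, tree = self._good_states[-1]
            self._run_checks(tree)

        def _run_checks(self, tree):
            # Placeholder for whatever feedback the real analyzer produces.
            print(f"analyzed {len(tree.body)} top-level statements")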
The role of the engineer in large enterprises with deep legacy systems is indeed quite like a surgeon, but not for the reason suggested. Rather, it involves delicate operations on a complex and confusing object that you didn’t design or build, that maybe nobody really understands in full, where simply locating the problem is more than half the job, where parts depend on other parts in unexpected ways, where one tiny mistake can have catastrophic consequences, and where a litany of credentials and clearances are required before they’ll even let you near the operating table.
> Yes, junior members will often have more grunt work, but they should also be given many interesting tasks to help them grow.
This and all the talk of status is a bit worrying
Junior devs are challenged by and have room to grow from tasks that would be routine and boring for a much more senior dev. The junior member might be completely out of their depth taking on a task that the senior dev would be challenged and grow from. This may influence who should do what tasks, although collaborating on a task is often an option
> My current goal with AI coding tools is to spend 100% of my time doing stuff that matters. (As a UI prototyper, that mostly means tinkering with design concepts.)
this struck me as weird. both in terms of “tinkering” being the most important thing to be doing, and then also describing “working like a surgeon” to be tinkering.
That isn't how analogies work--they are about partial similarities, not equivalence. The OP never says or implies that working like a surgeon is tinkering--allowing focus on the most important thing to be doing doesn't mean that the most important thing is the same for everyone.
Yeah, if an analogy is an exact match it's not an analogy any more.
I keep reading this as "code like a sturgeon"
(They're freshwater fish, so I don't think they like to work in C very much.)
Interestingly, I used the exact same metaphor the other day, suggesting that tools like GitHub Speckit should have a /wrap-up command:
> And then maybe "/wrap-up" to tie all the untied knots once you're sufficiently happy. Kinda like surgeon stepping aside after the core part of the operation.
This is what I am doing these days.
I spend most of my time thinking and asking Claude the right questions at the right moment instead of typing code. I review the code the agent generated, let it run tests, and deploy to a PR branch for live debugging. I review console logs and network traffic, paste the information into Cursor, and ask Claude for the root cause, code paths, and data flow. Solve issues one at a time.
It does feel like a surgeon working with a team of assistants. A lot of information, a lot of decisions, a lot of patience and focus.
If you code like a surgeon, a feature/bug has to be completed during the code surgery, with the story points closed immediately. In one sprint, three years' worth of features/bugs will be fixed. What will they do for the remaining three years?
It looks like the future is already here.
I'd like to officially coin the term "Trad Engineering" and/or "Trad Coding".
Yes, code like someone might die if you make a mistake!
Can you imagine a surgeon using Claude Scalpel as an agent to just go ahead and fix that one artery?
Yes, I know several careless surgeons who don't check their own work
I'm always finding sponges and gloves that were left inside our patient after previous operations.
Maybe it saves time if you know you're gonna have to reopen the patient later?
"Oh good, my tools are here."
This post is no longer relevant with Codex and Claude 4.5. You don’t need to be the savior that does the highly specialized important work. You just need to come up with the design and specs you need, then you need to ensure it implements them. So, you act as architect and manager, not a surgeon.
Doesn't a surgeon plan the surgery beforehand with their peers? I think that's similar to the design and specs in software engineering.
I also still need to monitor what Claude does in its changes, not completely delegate all the code.
The analogy I have used is “AI as sous chef.”
it's the second time this week I see this link being posted and it's the second time this week I read it as a 'sturgeon'
Theodore?
“First, do no harm”.
“Surgically” is how one enters a foreign codebase, especially legacy ones.
“and a codebase that’s well setup for it,”
I’d love to see some thought from folks on how to set up a codebase to be more productive with AI coding tools.
- Good automated tests which the coding agent can run. I love pytest for this - one of my projects has 1500 tests and Claude Code is really good at selectively executing just tests relevant to the change it is making, and then running the whole suite at the end (see the sketch at the end of this comment)
- Give them the ability to interactively test the code they are writing too. Notes on how to start a development server (for web projects) are useful, then you can have them use Playwright or curl to try things out
- I'm having great results from maintaining a GitHub issues collection for projects and pasting URLs to issues directly into Claude Code
- I actually don't think documentation is too important: LLMs can read the code a lot faster than you to figure out how to use it. I have comprehensive documentation across all of my projects but I don't think it's that helpful for the coding agents, though they are good at helping me spot if it needs updating.
- Linters, type checkers, auto-formatters - give coding agents helpful tools to run and they'll use them.
For the most part anything that makes a codebase easier for humans to maintain turns out to help agents as well.
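On the first point, a minimal sketch of the kind of plain pytest test an agent can target selectively (the file and function names here are hypothetical), e.g. with "pytest tests/test_text.py -k whitespace" before running the full suite:

    # tests/test_text.py -- hypothetical file; the function under test is
    # inlined to keep the sketch self-contained. An agent can run just this
    # file (or just this test with -k) instead of the whole suite.
    def normalize_whitespace(s: str) -> str:
        return " ".join(s.split())

    def test_normalize_whitespace_collapses_runs():
        assert normalize_whitespace("  hello   world ") == "hello world"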
One can be a "UI Prototyper" as a full time job... Sounds so nice.
The Mythical Man-Month focuses on a simple idea.
It can be summarized as "adding more workers to a project does not speed things up, that's a myth".
It's in the title of the book. It's a good book.
The entire IT field is about to test that idea on a massive scale. Can lots of new automated workers speed things up? We'll see.
Beautifully designed site. Looks sumptuous on mobile.
Yeah, downvote his ass!
Another way of saying this is:
If your development team consists of autistic junior programmers with eidetic memory, then you damn well better make sure that your documentation is exceedingly thorough, absolutely unambiguous, and as restrictive as you can make it.
Oh- oh god! Blood everywhere!
I wonder if OP reads all the code the AI gives him.
I doubt it, quite dangerous.
I sometimes use AI summaries to get the answers I need out of badly written documentation. That's about as far as I find any value or productivity boost.
Consider that this "surgeon" analogy has always been applicable when docs or books are better written and full of useful examples. Also consider that a lot of the annoying "plumbing code" you probably want AI for is fairly closed-ended as there are only so many combinations of API use possible.
I'm really not understanding the continued hype in 2025.
How much time have you spent running a coding agent like Claude Code, and have you tried running one in auto-approve (aka YOLO) mode yet?
I've written a bunch about those recently: https://simonwillison.net/tags/coding-agents/ - including a couple of video demos https://www.youtube.com/watch?v=VC6dmPcin2E and https://www.youtube.com/watch?v=GQvMLLrFPVI
> How much time have you spent running a coding agent like Claude Code
I spent a month trying to get it to build ImGui apps in C++ and Python. I did ImGui because I wanted to try something that’s not totally mainstream. I cancelled before the second month.
It would essentially get stuck over and over and could never get itself out of the mud. In every case I had to either figure out what was wrong and explain it, or fix the code myself, which was mentally taxing: after it made a batch of changes, and because I didn’t trust it, I had to sort of re-validate from scratch every time I needed to dig it out of the mud.
In the end it wasn’t faster, and definitely wasn’t producing better quality. And I’m not particularly fast.
The best analogy I can come up with is it’s like the difference between building a SpaceX rocket, and making a cake that looks like a rocket. You probably think that’s absurd, no one would try to use a cake that looks like a rocket. But it’s a really good cake. It blows smoke and has a system of hydraulics and catapults that make it look like it’s taking off. And the current situation is, people look at the cake rocket and think, “it’s not perfect but it’s basically 70% there and we will figure out the rest before long”. Except you won’t. It’s a cake.
And then we point to the documentation, and tests it built as evidence it’s not cake. But those are cake too.
So then the question is, if you live in a simulation made of cake but it’s indistinguishable from reality, does it matter?
Ouch, sounds like its C++/ImGui abilities aren't up to snuff yet.
I tend to use it for Python, JavaScript and Go which are extremely well represented in its training data. I don't know enough C++ myself to have tried anything with that language yet! Sounds like I'd be disappointed.
Why would you want to run one in auto approve mode?
Because it is massively more productive than when you have to manually approve what it's doing. You can literally leave it running for an hour and come back to a fully baked solution (or an unholy mess depending on how the dice rolls go).
I gave a talk about this earlier this week, slides and notes here: https://simonwillison.net/2025/Oct/22/living-dangerously-wit... - also discussed on HN here: https://news.ycombinator.com/item?id=45668118
Simon, we don't care.
You don't care.
And OK, I get that my persistence in trying to help people understand this stuff can be annoying to people who have already decided that there's nothing here.
But in this case we have someone who looks like they are still operating on a 2024 model of how this stuff can be useful.
The "coding agents" category really does change things in very material ways. It only kicked off in February this year (when Claude Code released) and if you haven't yet had the "aha" moment it's easy to miss why it makes such a difference.
I'm not going to apologize for trying to help people understand why this matters. Given how much of a boost I'm getting from this stuff in my own work, I honestly think it would be unethical for me not to share what I'm learning.
> Code like a surgeon ... As a UI prototyper
XD yes sure. I'd most definitely put those on the same level. Maybe even favor a UI prototyper if it comes down to the real deal. Who needs open heart surgery when you can have a magnificent CSS hover animation that really seals the deal on some 90%-AI-generated slop that only caters to delusional top management completely out of touch with reality.
Irony off: Let's try it with a bit of humbleness next time, ey?
As a coder and a neurosurgeon, lol.
Not to be confused with coding like a sturgeon which is blub blub blub blub
My god, don't code like a surgeon!
Code like a programmer.
This article further emphasizes the obvious point that, as programmers are well on the way to destroying their own profession and their work is poised to wreck the entire world, it's time to raise awareness that programmers work with much less discipline and responsibility than other professional, accredited, licensed trades, like, say... doctors.
But for now, sure, why not just compare yourself to surgeons, as you already anoint yourselves as "engineers".
If we're going down this road, then I claim programmers are more skilled than engineers or doctors.
90% of engineers are not inventing new things--they are merely applying codified knowledge (which they didn't create) to a new location or instantiation. In contrast, every new program is unique in the world (if it were not then it wouldn't need to be written--you'd just copy/fork/download the program).
And don't get me started on doctors. More than 40,000 Americans die each year from medical errors[1]. You really think the casualty rate from programmers--with all our mistakes--can beat that? I don't think so.
So, yeah, maybe surgeons are not the right model. Maybe surgeons could learn a thing or two from programmers.
-------------
[1] https://journalofethics.ama-assn.org/article/medical-errors/...
Sometimes I wonder how many people die indirectly from specific decisions. Like, large A-pillars in cars definitely sometimes save people when the car rolls over, but how many people die each year because a large A-pillar makes a blind spot and the driver hits a pedestrian? How many people die each year due to traffic slowing down ambulances?
People dying directly due to software bugs (e.g. Therac-25) is pretty rare, but what about an inefficient network that caused congestion and made a heimlich maneuver youtube video load slightly too slowly to save someone from choking to death? I don't think there's any way to measure it, and it's almost certainly important to train surgeons more than software engineers, but I still do wonder.
I agree with you: indirect causality is what kills people. Directly attributable causes are rare because they are easy to fix.
I also suspect these indirect causes are not easy to fix. It's not like 1 or 2 bugs cause 10,000 indirect deaths. It's more like 10,000 different bugs/design flaws cause 1 or 2 indirect deaths each.
> 90% of engineers are not inventing new things ...
That raises the question of just how much of a difference you need for something to be an invention. Merely copying another program is not invention, while coming up with and implementing TCP definitely is. But is implementing another instance of a CRUD app invention? Is configuring a server invention? What about if the configuration is written by a script? The dividing line seems harder to pin down than one might at first think.
Is another CRUD app any different than another static-force analysis on a house beam? How many engineering decisions are made by software running the math and spitting out numbers?
I agree that there are degrees of invention in both, but my argument (which is, admittedly, emotional in nature) is that there are more unique decisions in software, mostly because if you don't need to decide--if there is already an algorithm for it--then you can just plug in an existing library/app.
You only need a programmer when you can't just run an existing app. Therefore, all programmers create things that don't already exist. That's the definition of invention.
I know I'm exaggerating here, but it's less of an exaggeration than saying that programmers are "poised to wreck the entire world".
> You really think the casualty rate from programmers--with all our mistakes--can beat that?
I think that if my infrastructure + code had any direct connection to patient outcomes, there would be a lot of harm done. Not that I'm particularly bad at either, but I know the effective cost of errors is minimal, and certainly does not have a direct impact on people's health. If I had the same responsibilities as a surgeon, I'd have a much slower rate of change in my systems.
I do not in any way believe that the fact that we in IT kill fewer people than surgeons has any meaning for whether we're more skilled than doctors.
This is a good point.
My comment is really an emotional reaction to the (very common) denigration of software engineers, so don't take it too seriously.
But I also think that good software engineers can scale the reliability of their software to fit the purpose. There is a ton of software in medical devices, and despite the well-publicized fatal bugs, 99.9% of medical software works without error.
In fact, the automation of things like drug dispensing at pharmacies has decreased mistakes. I think if you deleted all the medical software in the world, deaths would increase.
This is a good perspective. In most disciplines where they need some level of repeatable quality, like medicine or construction, they have constrained the environment to such a degree that they can engage in somewhat repeatable work.
Consider how well cars would work if they had to traverse arbitrary terrain. Instead, we have paved the earth to make their usage somewhat consistent.
Construction projects repeat the same tasks over and over for decades. Surgeons perform the same surgeries for decades. But in software if something is repeatable and useful, it becomes an app or a library or a framework.
Some things in software have been heavily constrained to make the terrain navigable. Cloud computing in containers is one example. But in general, there’s a much higher degree of navigating uncharted territory in software than these other fields. Even building a CRUD app is often wildly complex, not the CRUD part, but the mapping of a specific business’s processes to that CRUD app, and getting those specific employees who currently work there to use it, is itself quite a novel undertaking.
Written like a true programmer... who has inflated his ego so much his belly might pop.
There is no shame in being a "true programmer". Thank you for the compliment!
Analogies aren't comparisons.
A surgeon has 4 years of undergraduate education, 4 years of medical school, and a 5 year residency, learning to operate (pun intended) with other equally highly trained specialists, many of whom are peers, like anesthesiologists, not merely support. The comparison was already dubious when Brooks made it for operating systems programming. Setting up a comparison with the average "I don't use anything I learned in my CS degree, lol" coder wrangling a chorus of hallucinating stochastic parrots is a bonkers level of hubris from techbros.
Most surgeons don't use anything they learn in medical school or residency either. It's usually their last 2-4 years (depending on whether they did a fellowship) that are useful in their day-to-day job. E.g. an eye surgeon doesn't need to know how to read an ECG.
Analogies aren't comparisons.