This works, until it doesn’t. I’m continuously shocked by these stories, where so many people put the future of their job/company in the hands of these agents after only a few months of existing.
I still constantly run into bad output from LLMs, from code to basic questions. I don’t understand how anyone can hand things over to something that is laughably wrong on a pretty regular basis, often in subtle ways that won’t be noticed by someone who isn’t reading closely and thinking critically.
They’ve gotten better, but I still regularly give them the old Nick Burns treatment, push it out of the way, and do it myself.
There's nothing shocking about this. The vast majority of software/source code is pretty terrible anyways: code that is full of bugs, slow to use, has little to no automated tests, and is very hard to maintain.
To the extent that it gets fixed or works at all, it's not because of competent developers doing rigorous analysis of the software, it's because either someone testing it or using it gets annoyed, reports an issue, and then that specific issue gets patched out.
If using LLMs to perform a similar function shocks you, then you should have been shocked already by the proliferation of pretty bad software for the better part of the last couple of decades.
So many criticisms of LLMs assume that people have been writing software very diligently, applying a high standard of engineering, subjecting the code to a battery of rigorous tests, passing it through a strict review process... and that does happen for some software, especially software that is commonly used, but it's not true for the vast majority of software developed.
AI is no good, but neither are people, isn’t a great sales pitch.
I think for small tools that people want to make for themselves, that’s great. Where I see problems is when other people and money get involved. If something goes wrong, who is accountable? Claude wrote it, Claude reviewed it, Claude submitted the PR… yet Claude can’t have any real accountability.
I think small tools people make for themselves is realistically less than 1% of software produced. Most of the code, and - to the GP’s point - bad code, is produced in corporations with plenty of money and budget.
There is just such a tremendous amount of waste at every company, in that the headcount and software expands to fill the budget. I’m not defending Elon, but look at how much he slashed from X (80% or so?) and the company still has its core product functioning and an active user base.
There is a ton of software (especially internal) at essentially every company that also is low accountability before Claude. “Oh Ted built that but he’s working on a new important project. I understand it’s broken and that’s impacting you but we won’t be able to prioritize this until next quarter at least. Can you set up a meeting next month to discuss?”
Honestly, the likely outcome of all these LLMs is indeed a higher amount of software with no accountability, but also an improved ability to juggle more of that software to the same (realistically low) standard.
It's an absolutely phenomenal sales pitch to executives. A ton of automation is sold on the basis that it's probably not going to be as good as having a dedicated person do it, but that automation leads to much lower maintenance, scales better, and is more deterministic and reproducible.
It's a really fun philosophical exercise to ask what it means for them to be "wrong." My perspective is that they are fantastic at association and generalization (of language and symbols in particular), but whether they're identifying the associations you care about or generalizing to the level of abstraction you're aiming for is a complete crapshoot. If you aren't checking and correcting them, and discarding the misfires, you will end up with a very pretty Tower of Babel.
One area where I feel safe saying they are “wrong”, rather than just going with a different assumption that was left unsaid, would be when it makes up API endpoints. It sees the general pattern in an API, then makes up an endpoint that sounds good, follows the pattern, but isn’t actually implemented.
I’ve also seen a lot of issues with co-workers using an LLM to write their readme files. I look at the readme for what return values I should get, go to use them, and get an error. I check the code, and sure enough, none of the variables in the readme exist. The LLM just thought they sounded good. Things like this I would say are pretty objectively wrong.
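A cheap guard against this kind of readme drift, as a minimal sketch: pull the backticked identifiers out of the readme and check that each one actually exists on the thing it claims to document. Everything here (the `fake_api` namespace, the `missing_names` helper, the readme text) is hypothetical and only for illustration; a real check would import the real module instead.

```python
import re
from types import SimpleNamespace

# Hypothetical stand-in for the real module the readme documents.
fake_api = SimpleNamespace(status_code=200, body="ok")

# Hypothetical readme excerpt; `retry_after` is not implemented anywhere.
README = "The call returns `status_code`, `body` and `retry_after`."

def missing_names(readme_text, module):
    """Return backticked identifiers in the readme that the module lacks."""
    names = re.findall(r"`([A-Za-z_][A-Za-z0-9_]*)`", readme_text)
    return [name for name in names if not hasattr(module, name)]

print(missing_names(README, fake_api))  # ['retry_after']
```

Run in CI, something like this would flag the invented name before a reader trips over it. It only checks that a name exists, not that it behaves the way the readme says.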
It was hype all day long, and managers forgot that AI is a tool and not some magic wand. A tool like a DeWalt or a Makita. After AI came out, some colleagues at my company expected me to generate 600-700 lines of code or more. I tried to explain that I cannot read or understand what's actually happening that fast, but they were like: just push, go, copy-paste it. Complete autodrive mode, insane. Then I spent the weekend fixing it, which made me doubly mad. What's actually happening is absurd: because of all the stories out there, managers think that Claude generates perfect code and that you could make a Twitter clone in half a day...
My personal experience: writing code has always been the easy part. AI does most of that now.
Understanding the problem and the existing system well enough to design the right solution, even with AI assistance, is a higher cognitive load. I’m doing a lot more of that lately.
I’m more productive, but also more tired. This may be due in part to the breadth of what my team owns, which makes my day a bit more context-switchy than other teams.
As others in this thread have noted, the situation is still evolving. However, I worry less each day about being replaced by AI. There has always been more work than available bandwidth in my experience.
What seems clear to me is that expectations around velocity and throughput will increase (are increasing). AI use will be required to meet those expectations. Learning to use this new tool effectively will be essential for career progression (and preservation).
> My personal experience: writing code has always been the easy part. AI does most of that now.
The only reason dev jobs paid more (by a factor of two or more) than pure solution modeling was because "writing code" was the hard part.
If you wanted to get paid just modeling the solution and handing it off to a coding team, those jobs were available for decades, typically called Business Analysts but few devs moved from dev to BA.
> Understanding the problem and the existing system well enough to design the right solution, even with AI assistance, is a higher cognitive load.
I've found that the act of physically writing refines my understanding a lot more than simply reading.
We don't typically expect a person to read a trigonometry textbook and then perform well on an exam. They have to drill problems to surface their misunderstandings to themselves.
My fear is that, with developers adopting your approach, they're "designing" systems in much the same way that a read-the-book-only trigonometry student solves trigonometry problems.
Agree. Also, there is a lot of fog at the moment. AI generates more code, we need a lot of markdowns now to teach it how to write "good code"... and <insert here a lot of AI processes>. But at the end... a programmer has to take ownership of that code and responsibility, meaning: reading A LOT of code and/or coding more code.
Responding to my own comment to add that I think this moment favors the curious and passionate. None of what I wrote above is a complaint. I’m having more fun now than I have in a long time.
I have had some truly spectacular results that still kind of stagger me in the last few months using Claude in my hobby projects -- but even though Claude insists on trying to slip its name into the git history as credit it's not Claude -- it's me. Someone who has studied CS and software engineering for decades will craft different prompts from someone without that background. A suggested axiom: there is nothing I can build with Claude that I could not build myself with my current level of CS knowledge, assuming I had infinite focus and time. In my hands it can go as far as I could anyway, and no further. (But it is faster!) My experience bears that out so far.
Fair enough, but speed, especially the kind that comes with LLMs, is fast enough to open new ways of working and doing things. We don't have infinite time, and if there's something that can give me multiple, for example, UI suggestions in a minute which I can pick from, it's a different way of working than sitting with a UI designer for several hours of discussions. So, while I agree with you in theory, I don't fully agree with what I think you're implying when it comes to practice.
Claude writes probably 95% of our code now, fintech, amongst top 5 in the world in what we do. I am 100% certain we're not even at the forefront of using agents for coding compared to some others.
> Someone who has studied CS and software engineering for decades will craft different prompts from someone without that background.
This, to me, is the biggest differentiator. In terms of results, there's a huge yawning chasm between the person who says "Claude make me a $thing" versus the person who puts in the effort to lay down the overall architecture, gives some thoughts to libraries and dependencies, performance trade-offs etc, and only then begins prompting.
Knowing how to implement Dijkstra or a linked list by heart is no longer important. Actual software engineering skills are more important than ever.
The gap is closing; a shitty wannabe programmer will eventually learn the structures one way or another. Agentic coding just got many new people involved, and these new people create noise.
I'm a Senior Freelance Programmer, I can see many of my past and present clients moving towards the exact path you described. I keep warning them during meetings that Claude model isn't sustainable for long, eventually the VCs will come for their revenues and Claude will be forced to close their access to all but the most enterprisey ones with deep pockets. The mere electricity cost for that kind of high level reasoning and abstraction can't be subsidized forever. However, there are other forces which pull them towards Claude and AI workflows. Most of the clients are in a "wait and watch" mode right now, using LLM assistance for code generation but not fully depending on them.
Before LLMs came, there used to be the technical debt to deal with in a project, now there is also the added cognitive debt which is way more subtle and impactful long-term. If your source of truth isn't source code but a prompt (or even a series of prompts with branches) and the executor of prompts is a non-deterministic agent, I think you've already lost the battle there.
Using today's model prices as a rebuttal is a very weak argument.
Two years ago, SOTA was gpt-o1, and it was much more expensive than Fable. Now, for $4,699, you can easily run a much smarter Qwen3.6-35B locally with DGX Spark.
Think about where we are. This is an era where a new SOTA arrives every two months. It took LLMs only about 18 months to go from chain-of-thought reasoning to disproving the unit-distance conjecture. ChatGPT itself is only three and a half years old.
DeepSeek V4, released two months ago, costs almost as little as the electricity to run it, and would absolutely be a top-tier model by 2025 standards.
You ignore that Claude is not alone: tech progresses and reduces costs, and there are always the Chinese alternatives, which are becoming good enough over time.
Low-skill work that used to be outsourced will go to cheaper LLMs, unless wages are depressed enough / running costs are high enough to keep using humans as cogs in the machine. This will also consume a ton of small-scale things, like personal-sized automation and small-business customization of better-crafted things (stuff that normally wouldn't be paid for in the first place, or only extremely rarely). Some will obviously exist, because paying someone else to farm out a ton of mediocre output with LLMs is still worthwhile sometimes, but it's going to be gutted as a general statement.
Especially with prototyping-style work, LLMs are clearly good enough for a ton of business-oriented proof-of-concepts, and that line of work is essentially dead. Unfortunately a lot of mid-tier art falls into this category as well, particularly because execs very clearly can't tell good art from bad (on a "customers like this" scale, with functionality being the judge, which is fairly objective. not a subjective "this is good art").
High-skill work is still necessary, but it's hard to tell if it's actually going to be more important (because skill is obviously still needed for actually-good results, and I honestly see no evidence that this will change with current tech) or less (primarily due to less demand, and it being significantly harder for non-skilled to judge skill when everyone can prototype something seemingly-impressive in a weekend). Some will very obviously continue to exist though.
Whether this means "high-skill people are going to be fine, stay the course" or "<10% of high-skill people will be fine, you had better be scrambling right now or looking for a new line of work" is... much less clear.
The profession has already changed. For the past eight months, AI has been competent enough to code like the best human programmer, but strangely, the software isn't any better yet. Everyone has lost sight of what the profession truly is. It's not just about coding; it's about software engineering. Our role is no longer that of programmers, AI has taken over that role. Our role is that of engineers who manage programming agents. Every attempt to have AI develop a medium-to-large project fails because the goal is to solve everything with a magic four-line prompt. We're forgetting the structural aspect, the engineering side. We must treat the tool as just that: a tool. The direction and responsibility remain in our hands. It's not about reviewing the code line by line; it's about ensuring that the product faithfully represents a well-planned engineering intent. That's why the concept of AI-augmented Software Engineering is so important.
> AI has been competent enough to code like the best human programmer
It’s really not. Opus 4.8 can’t produce good software design and it still makes straightforward implementation mistakes. Two errors it made in one day for me recently: it built the Cookie class I asked for without a name field—cookies have a name and a value—and it neglected to handle a case where a database could have multiple rows with the same id, just returning whatever came back first.
The “best human programmers” absolutely would not have made those mistakes. At worst, they would have asked if I really meant what they thought I meant.
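For concreteness, here is a minimal sketch of what handling those two cases might look like. This is not the code from the anecdote: the `items` table, the `fetch_one_by_id` helper and the `DuplicateIdError` exception are all hypothetical names made up for illustration, and raising on duplicates is just one reasonable policy.

```python
import sqlite3
from dataclasses import dataclass

@dataclass(frozen=True)
class Cookie:
    name: str   # a cookie is a name/value pair, so both fields are required
    value: str

class DuplicateIdError(Exception):
    """Raised when an id that should be unique matches several rows."""

def fetch_one_by_id(conn, row_id):
    """Return the single row with this id, None if absent, raise on duplicates."""
    rows = conn.execute(
        "SELECT id, label FROM items WHERE id = ?", (row_id,)
    ).fetchall()
    if len(rows) > 1:
        raise DuplicateIdError(f"{len(rows)} rows share id {row_id}")
    return rows[0] if rows else None

# Hypothetical table with no uniqueness constraint on id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, label TEXT)")
conn.executemany("INSERT INTO items VALUES (?, ?)", [(1, "a"), (2, "b"), (2, "c")])

print(Cookie(name="session", value="abc123"))
print(fetch_one_by_id(conn, 1))  # (1, 'a')
```

Calling `fetch_one_by_id(conn, 2)` raises instead of silently returning whichever row came back first, which is the behaviour being complained about above.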
I understand your point, but what you're describing is exactly the kind of mistake even the best human programmer could make in a poorly managed environment. I'm concerned that since AI emerged, we've overestimated our programming abilities. The comparisons we make between our own work and AI are based on an assumption of absolute perfection that doesn't exist in reality. Bugs aren't an invention of AI; they're ours. All modern software engineering, testing systems, version control systems, and so on, were developed through years of dealing with our own mistakes. We don't make systems fault-tolerant by understanding that failures are external to our work. These failures are our doing, and now they're AI's doing too. We have to deal with applying to those agents the management that we previously applied among ourselves. The example you provide is very good, because you yourself, with your human mind that solves problems, suspect that the origin of the problem was poor communication, and you are very likely right, but just as if it were a human error, the programmer is responsible for their faulty code, but you are responsible for poor process management, and yes, the same applies to working with AI agents.
What are you writing that Claude is actually writing all of it? Every time I get past the green field stage, I just end up throwing out what it writes half the time since it's trash. Claude seems really great at "fix this unit test", "generate this boilerplate", "take this UML and build this framework out". But when I am doing refactorings, or implementing things that are beyond monotonous, I end up writing it all by hand. My best luck is still to do the design, query AI for possible choices, sketch out the framework of what I am writing, have AI critique my plan, and then have AI design individual methods, then fix what it writes.
What you say could be theoretically possible, but it's probably an issue with your usage of it. For example: if any of these hard, non-promptable projects is available on GitHub, or you've seen this problem in any large-scale GitHub project, you can share that. I've rarely seen a repo and a problem that Claude can't chew through with the right prompt.
> Every time I get past the green field stage, I just end up throwing out what it writes half the time since it's trash.
Is a skill/PEBKAC issue. You still need to exercise engineering best-practices like decomposing work to the smallest unit before taking a task on, brainstorming design first and implementation last, clearly defining your success criteria and requirements before beginning any work, etc.
I'm on a >10yr old codebase and have been able to get my org to orchestrate entire features, fully unit tested, e2e tested, storybooked, from scratch without touching an IDE. Refactorings and the endless mountain of 80% completed migrations from one pattern to another are now trivial to offload.
Point your SOTA du jour at the original docs, a few of the original examples/PRs and have it draft a skill describing the work, the scope, and the success metrics. Iterate on the skill with the main agent by subagenting to test the skill until you are happy with the result and it mostly gets it right with the guardrails you've defined. Again - keep the scope extremely small. It gives much less rope for the agents to hang themselves with and it is less cognitive load when you have to review/test the PR.
Then set up a reasonable cadence for it to execute an autonomous thread on and review when you get comfortable.
----
The issue I've been running into lately is simply that we've got so many PRs coming in that doing thorough human reviews on them is not sustainable relative to the rate at which the team is creating agents to open them, and people (especially juniors and mid-level) are getting burned out by essentially having entire days where they are just doing code reviews.
"Computer" used to be a job title. So no, I am not optimistic about the future of most programmers, maybe even all programmers.
One possibility is that software starts to look more like traditional manufacturing.
The machine is the company’s core asset.
The engineer only needs to know how to operate the machine well. Once that happens, the barrier gets much lower, far fewer people are needed, and the job naturally becomes much less valuable.
Some parts will still need to be done by hand, of course. But only a very small part.
It is like old factories. They used to need lots of fitters, at all levels of skill. Now you only need a few of the elite ones.
AI is the CNC machine of the software industry.
The more pessimistic future is that, maybe five years from now, the best programmers will look at AI the same way the best Go or chess players look at AI today:
Like Ke Jie said, "I don't even know what I am trying so hard against."
We now have a new SOTA every two months. It just took 18 months for LLMs from reasoning models to disproving the unit distance conjecture. ChatGPT itself has not even existed for as long as a college student spends in university.
In any case, we have already passed the point where this can be rolled back.
Maybe ten years from now I will be leaving a comment saying that "programmer" used to be a job too :-/
Programming is the low-hanging fruit for AI.
Open source and knowledge sharing have given it a huge amount of public, high-quality training data at a level other industries can hardly imagine.
And almost everything in programming can be tested and verified inside the computer quickly in a closed loop. No robot arm is needed.
The main weakness of current LLMs is still that they are static:
They do not really change themselves through use. Harness tools are just elaborate ornamentation on top of prompts. LLMs are frozen at the moment training stops.
Once we get models that can change their own weights through self-feedback, then maybe AGI really is on the horizon.
Thinking optimistically: I may be lucky enough to see it in my lifetime. Maybe by then, people will be able to live more like human beings, instead of organizing their whole lives around work :-)
Thank you for your comment. I enjoyed it a lot. Good food for thought.
Your analogy is a bit of a leaky abstraction, in the sense that it misses the broad statistical nature of LLMs. However, I find it is a good way to illustrate the potential industrial transformation.
It is hard to say what the future will bring. The original Ask HN post is definitely an omen of things to come.
Basically, in a decade or so, we'll be completely out of the loop in software development; even this title won't exist anymore (like the 2000's webmaster). We'll still be around, but with different roles.
For what it’s worth, I find comments and articles with assertive predictions like this difficult to take at face value.
I don’t even disagree with the premise, but it shifts the burden of assessing likelihood back onto the reader, so it doesn’t really add much value to me.
Software was never a precise occupation. It was a tool shed that obeyed the Pareto Principle: 20% of the people deliver 80% of the value.
This was the expectation because hiring was the primary goal, not product delivery. The industry could have fixed this by upgrading itself from an occupation to a profession built upon standards and credentials. That never happened because the 80% would become unemployable, which adds friction to hiring (the real business goal before COVID).
Now, that ship has sailed and there is no going back. The Pareto Principle is now a 5%/95% funnel because there is less incentive to learn to do the work. Good luck!
Eventually the costs of running inference will catch up to us, then we will see. But LLMs are really expensive, and it is possible that with the incredible amounts of code they generate it might become too expensive for them to keep up with it. There should be some kind of equilibrium which might take a while to reach, but I think knowledge work won't disappear.
However many people have rightfully been saying it for years before LLMs that many so-called software engineers had no business in this field because for a lot of them it was just a way to earn more money than peers. It's not an issue by itself, just a rational human choice but the fact that it was possible was just because of unhealthy economic conditions.
My impression is that smaller companies, that depend on rapid prototyping to gain clients, exert a lot of pressure onto their devs to use LLMs. At least that's the situation in the companies some friends of mine are working at.
I'm in a slow-moving, much bigger company. Lots of talk about "AI" here and we can use copilot if we want to, but there is 0 pressure. I'm in a small team and one colleague uses copilot often. In the beginning there was a minor conflict between him and me, because I found the quality of the LLM code unacceptable and had to ask him to review it more carefully. I think that's settled now, but it makes me sad how a once motivated colleague now seems to try to cheat his way out of work.
I personally find it incredibly boring to write copilot prompts or read its answers full of boiler plate and sycophancy. I don't understand how anyone would want to replace the cognitive work of programming, that I find enjoyable for the most part, with the cognitive work of "talking" to an LLM.
Anyway, I think it will be like this at least for a little while longer: only vibe coding allowed in small companies and less vibe coding the bigger the company is.
But before vibe coding can take over the slow-moving big companies, all the accumulated technical debt will come back to haunt us and vibe-free software will be the new fad. That's what I hope at least.
It's similar for us to a certain extent. I honestly don't know yet where this will lead.
Personally I also don't really follow the arguments that the agent does the coding and the human does the understanding.
In my opinion one thinks differently about the code when not writing it oneself: on a higher level, or let's even say a more superficial one.
To keep the understanding on the same level you would have to limit the agent to just "typing", but this is definitely not what's happening.
Yes, may be a skill issue[tm], may be an inherent one or it may not even be relevant anymore, we will see.
For me it currently still works well because I am working on legacy systems I more or less have a good understanding about - so I can judge the agent code.
Not sure how this will be with new, green field code.
Not even starting with the discussion how it should be possible to review the sheer amounts of generated code.
From my experience, AI right now is not perfect and people still treat it as if it were: leaked secrets, 10 bugs created to solve one minor issue, a whole backend that is a mess and won’t hold up at scale. It is improving, though, and all these issues will disappear. Once AI takes over most of the technical hurdles, what will be left?
Materializing and scoping down a broad idea (or just an abstract vision) into what needs to be built. Saying
“Hey Claude build me a fitness app”
is not the same as actually understanding the customer behind it, their behaviour and psychology, the journey, and what the actual problem is that you’re solving.
Tech in general and programming emerged as a need to create new things in a digital world to solve our problems, building them just got easier but understanding the problems and the people behind them, that’s gonna be an increasingly necessary task
It's disappearing. Even if models stop evolving tomorrow, there's still enough potential in the harness improvement to reach the point where anything 99% of us know about software engineering is useless. How many humans will be involved in creating software after the dust settles is anybody's guess, but I wouldn't bet on it being anywhere near the current level.
- LLM adoption varies across the org. Some are heavy users and some less. Some suspicious some less.
Where are we heading? Depends on model/harness capabilities. Likely some sort of mix where some projects will still require expert humans and others will just be vibe coded. How much we lean in that direction - we'll see.
I will just say, if you are any good at programming and have experience using agents, you're in the top 0.1% of the world in adoption of a critical new technology.
It may seem hopeless as a programmer, but imo you'd be much better off reframing your situation re: the above sentence.
Iunno, I feel like being born in a first world country did most of that "top 0.1% of the world" work. That sentence works the same with and without AI/LLMs.
Among peers, I feel like I am top 20%? 30%? maybe, by being a good programmer who is adept at agents. A year ago was the 0.1% point, this stuff is spreading like wildfire. A year from now I think it's going to just be de rigueur that these are our tools now.
Worse, any edge I have from working with this stuff for years is quickly dulled. The tools are evolving fast. My tricks from 3 months ago have been eclipsed.
"Well, in our country," said Alice, still panting a little, "you'd generally get to somewhere else—if you ran very fast for a long time, as we've been doing."
"A slow sort of country!" said the Queen. "Now, here, you see, it takes all the running you can do, to keep in the same place. If you want to get somewhere else, you must run at least twice as fast as that!"
Not the same thing. Developers' clients are being approached by thousands of people instead of a handful. It creates the illusion that everyone can do the same thing for cheaper.
For the last 6 decades or so, a computer was a machine assumed to operate with high levels of precision and deterministic outputs. Such precision enabled spacecraft like Voyager 1 & 2 to travel billions of miles from Earth, staying on course, semi-operational and sending telemetry, 50 years after launch.
Now we have machines that, when asked to produce a paperclip, may instead produce a butter knife, or a banana, or maybe just a "try again later".
These modern "tools" are quite a different animal. They're more akin to roulette wheels that generate massive amounts of heat and CO2.
This is cope. SOTA agents produce exactly what's asked; usually it's the asking that's the problem, not the result. Improve the prompt and the output drastically improves.
- spending exorbitant amounts of time up front planning, surfacing every milestone, subtask, and individual change, burning tons of tokens in addition to man hours fixing minute but important mistakes or deviations despite explicit instructions, CLAUDE.md, memory subsystems, agent and skill instructions, etc.
- executing and reviewing results compared to the plan and finding even more mistakes and deviations from the plan often takes a lot of time
- the relative rapidity produces a lot of output for the team which stresses our lifecycle and introduces feelings the whole process is on the verge of flying off the rails
- individual developers have different expectations, definitions of acceptable, skills/experience to detect and deal with problems, and patience. I might spend hours to days meticulously planning and executing a ticket and another guy might yolo it in 30 minutes. Other than bug escape rate and tracking review failures I’m struggling with how to track people who are “doing it wrong” let alone telling them the “right” way to do it.
- growing exhaustion, lack of ownership and confidence, frustration, and a generalized feeling of endlessly fighting your tools but in a weird way they never seem to really improve despite all your efforts to do so
- taking humans out of the loop and letting the agents be more autonomous in hopes that we’ll reduce the bottleneck and produce better results has not helped.
- I find even myself fighting (and sometimes failing) the urge to give in even though the proposed or implemented solution doesn’t feel right. Scale that across your team
- experience has not really changed despite changes in models and harnesses
- there’s a deep feeling that I’m doing something wrong and fomo since so many people in the industry boast of incredible results. I probably am, but everything I’ve read about and tried has not really moved the needle much and it’s introducing another dimension of exhaustion and frustration
Overall, I don’t feel like we’re being much more productive when you factor in quality and accountability (which should be a given, but this industry is increasingly overtaken by a reckless philosophy of speed over everything else). I do think it has helped parallelize tasks, produce higher quality PoCs to explore more options and do it faster, offload joyless but necessary tasks that are narrow in scope and measurable, do exploratory work and act as a generalized interactive knowledge base, and make shallow use of techniques, technologies, etc. that you don’t yet have experience with. Maybe I’m just missing a critical component or two in the process (formal specification, etc). Maybe it’s growing pains, or maybe there’s a looming rot. Either way job satisfaction and confidence in my work is much lower than I would have expected.
in the olden days (pre-LLMs) we would write high-level code.
the entire layer was high-level code and rarely would we ever need to peek into the assembly:
writing, debugging, architecting, reviewing, testing - all were done in the high-level language layer.
---
welcome to present day:
since we don't write code - we write intents, we also shouldn't review code either - we should review intents.
I don't review my code anymore. I ask the agent to generate markdown docs, graphviz diagrams, changelogs, audit reports, etc. I only review that.
I also ask it to write tests and evaluate by whether the tests passed or not. I don't need to peek into the test code - I can also ask for plain English, pseudocode, a control flow graph, whatever it is I want.
I can ask it to find errors or missing tests and improve that too!
code is like assembly now.
rare are the cases you would need to peek into that level.
That's maybe wishful thinking on my part, but I see it moving more towards other engineering fields: project engineers design it from scratch, everyone must speak architecture, customer and compliance at once, and we will have standards and "codes" drawn up by the end of this decade.
Same here. At our company, we've pretty much stopped writing code by hand. We hand the implementation to Claude and Codex now. Feels like the real skill is moving up a level: architecture, design choices, and knowing what should be built in the first place.
I see nothing wrong with something probabilistic. I think it is all about offsetting the risk and reducing the odds of bad outcomes. There is this concept of Defence in Depth, thus I assume some sort of binomial formula also applies here.
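To put rough numbers on the defence-in-depth idea: a minimal Python sketch, assuming each layer (tests, review, staging, and so on) misses a defect independently of the others. The function names and the miss rates are hypothetical, chosen only for illustration, not measured from any real pipeline.

```python
from math import comb


def escape_probability(miss_rates):
    """Chance a defect slips past every layer.

    Assumes the layers fail independently, so the miss rates multiply.
    """
    p = 1.0
    for rate in miss_rates:
        p *= rate
    return p


def prob_at_least_k_escapes(n, p, k):
    """Binomial tail: chance that at least k of n changes ship a defect,
    when each change independently ships one with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # Illustrative (made-up) miss rates for tests, review and staging.
    per_change = escape_probability([0.2, 0.3, 0.5])
    print("escape probability per change:", per_change)
    # Chance that at least one of 100 such changes ships a defect.
    print("at least one escape in 100:", prob_at_least_k_escapes(100, per_change, 1))
```

The independence assumption is the weak point. If the same model writes the code, writes the tests and reviews the PR, the misses are likely correlated, and the real escape probability would be higher than the product suggests.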
I mean, literally the answer is that nobody knows. Maybe the robots replace us all. Maybe they shift those who remain into being some combination of Product Manager and QA. Maybe there's still a role for a technical overseer even in the medium-long run.
But it sounds like you're really asking about the state of the world today. If so, I don't think that ideal state is like your friend's company (or at least, as it appeared to be to you). It might be possible that you can make that "dark factory" pattern work (StrongDM seems to be doing it), but it would require infrastructure and discipline that I doubt they're mustering. Think about how CD didn't involve taking a sloppy build process with no testing or observability and just going straight to prod -- it required building up a lot of infra and discipline first.
But on the other hand, I don't think the ideal present involves artisan hand-crafting code either. I haven't written a line of code by hand in enough months that it would genuinely feel weird if I were to try to program that way despite decades of having done just that. That era's done with, and moderate normie practices right now today are more about supervising and guiding agents than about chiseling code into clay tablets.
From what you said: Not looking at code is bad, not because Claude can slip a few bugs (it can), but because LLMs tend to default to writing more code and features than needed, which isn't a good thing. I see a lot of people making 10+ PRs per day, but most of them are just going back to fix earlier PRs.
Claude always likes to "go big," for example, by choosing tools that can support millions of concurrent users or by adding unnecessary layers of abstraction that create more maintenance pain. I guess that's good for LLM companies, since more tokens are spent fixing the mess it caused.
Every time I enter plan mode for a huge feature, I end up cutting about 30-60% of the task scope before the LLM can actually start the work. I review the final code, and I still find things to cut. As said before "The best code is no code, or code you don’t have to maintain" [0]
> it feels like software development is going from a precise occupation that requires high degree of understanding to something probabilistic and offloaded understanding
To me it felt like there were always engineers and vibers.
Vibers don't work systematically, never test, accept unknown regression, don't use git, and if they do, treat it like Dropbox, use terrible languages, have terrible habits. Vibers got a new tool, too, and it vastly increases the amount of slop. But the slop was always there.
The slop actually got better after vibers stopped writing their own slop, I have to say.
And vibers are less defensive about the particular slop they didn't write themselves.
"This is exactly why we built AINAScan — we found that AI-generated code passes all tests and 'works', but consistently produces the same 15 structural bugs: save functions that never write to DB, async functions with no await, parameters that have zero effect on return values. Linters miss all of these. The code looks fine until production."
My profession is not, and never was, _programmer_. Lines of code—the actual text, is a means to an end, not an end in and of itself. I'll take heat for that here for sure. But do you think a carpenter considers himself "one who screws nails" or "glues joints"? No, the small minutiae of the job was never the job itself.
I think the genie gets put back in the bottle, at least partly.
I don't think the future is massive data centers running at a staggering loss to generate questionable code.
The future is rethinking IDEs to have local models work in partnership with the developer to ease tedium and catch mistakes.
A model that maintains a visual, zoomable mind-map of the entire project, with two way binding. Code can be created visually or textually, same with data flows.
Project structure and architecture are presented in high-level ways, that can be easily altered and refactored with almost zero tedium.
I think we start using AI for what it's good for: pattern matching and transformation, and stop trying to make it reason and pretend like it's a human.
Once we, as an industry, figure this out we'll unlock a massive boost in quality and productivity, but it looks like there will be some painful times ahead before everyone realizes that the token extrusion machines are only increasing the total cost of ownership, and they are being used incorrectly when we try to outsource our thinking to them.
I think there's an enormous opportunity to build these tools right now, and that whoever nails it will win.
> I had an interview where I was asked the obligatory “what’s your AI workflow” and I said I use it for searching documentation and writing small functions or boilerplate that are tedious. Then I was asked whether I use Cursor. I said no, and immediately was told that “I’d be a better programmer if I used Cursor”. I have 13 years of software engineering experience, and was talked down by an AI startup with no minimal viable prototype. Then I was told I did not have the experience for the role. I love this timeline so much
This has always been a very different profession depending on where you work and what you're working on.
I haven't worked at a startup in over a decade, but the stories I hear now sound the same as back then. There's lots of wasted effort for mediocre to poor code destined to be rewritten or thrown away until there's enough investment to justify more work. At which point, "more work" just means more sprawling slop instead of fixing the technical debt rotting at the foundation.
AI just put a spotlight on the futility of trying to run before you can walk. Whether so many founders are going to stay in denial about it is yet to be seen. Statistics about any line of business says yes. This is how most businesses fail and most of them have to fail.
> ask claude to write, and ask claude to explain
This works, until it doesn’t. I’m continuously shocked by these stories, where so many people put the future of their job/company in the hands of these agents after only a few months of existing.
I still constantly run into bad output from LLMs, from code to basic questions. I don’t understand how anyone can hand things over to something that is laughably wrong on a pretty regular basis, often in subtle ways that won’t be noticed by someone who isn’t reading closely and thinking critically.
They’ve gotten better, but I still regularly give them the old Nick Burns treatment, push it out of the way, and do it myself.
There's nothing shocking about this. The vast majority of software/source code is pretty terrible anyways, code that is full of bugs, slow to use, has little to no automated tests and very hard to maintain.
To the extent that it gets fixed or works at all, it's not because of competent developers doing rigorous analysis of the software, it's because either someone testing it or using it gets annoyed, reports an issue, and then that specific issue gets patched out.
If using LLMs to perform a similar function shocks you, then you should have been shocked already by the proliferation of pretty bad software for the better part of the last couple of decades.
So many criticisms of LLMs assume that people have been writing software very diligently, applying a high standard of engineering, subjecting the code to a battery of rigorous tests, passing it through a strict review process... and that does happen for some software, especially software that is commonly used, but it's not true for the vast majority of software developed.
AI is no good, but neither are people, isn’t a great sales pitch.
I think for small tools that people want to make for themselves, that’s great. Where I see problems is when other people and money get involved. If something goes wrong, who is accountable? Claude wrote it, Claude reviewed it, Claude submitted the PR… yet Claude can’t have any real accountability.
I think small tools people make for themselves are realistically less than 1% of software produced. Most of the code, and - to the GP’s point - bad code, is produced in corporations with plenty of money and budget.
There is just such a tremendous amount of waste at every company, in that the headcount and software expands to fill the budget. I’m not defending Elon, but look at how much he slashed from X (80% or so?) and the company still has its core product functioning and an active user base.
There is a ton of software (especially internal) at essentially every company that also is low accountability before Claude. “Oh Ted built that but he’s working on a new important project. I understand it’s broken and that’s impacting you but we won’t be able to prioritize this until next quarter at least. Can you set up a meeting next month to discuss?”
Honestly the outcome for all of these LLMs is indeed likely a higher amount of software with no accountability, but it’s also an improved ability to juggle more of that software to the same (realistically low) standard.
"A computer can never be held accountable
Therefore a computer must never make a management decision"
-- Internal IBM training manual, 1979
It's an absolutely phenomenal sales pitch to executives. A ton of automation is sold on the basis that it's probably not going to be as good as having a dedicated person do it, but that automation leads to much lower maintenance, scales better, and is more deterministic and reproducible.
> little to no automated tests
I'm still amazed people don't achieve extremely high test quality, since you get tests "for free" now.
One of the limitations of testing was always that people "design" things so they're hard to test.
And then they argue "This can't be tested", or "Refactoring this for testing is not worth it."
It is now. Yet, I work on codebases with no tests and lots of yolo co-authoring.
AI is just a tool, and, as always, people will use it incorrectly and lazily. Are we forgetting the good old days of Copy/Paste from Stack Overflow?
LLMs just made it more convenient for the same people to take the lazy route.
It's a really fun philosophical exercise to ask what it means for them to be "wrong." My perspective is that they are fantastic at association and generalization (of language and symbols in particular), but whether they're identifying the associations you care about or generalizing to the level of abstraction you're aiming for is a complete crapshoot. If you aren't checking and correcting them, and discarding the misfires, you will end up with a very pretty Tower of Babel.
One area where I feel safe saying they are “wrong”, rather than just going with a different assumption that was left unsaid, would be when it makes up API endpoints. It sees the general pattern in an API, then makes up an endpoint that sounds good, follows the pattern, but isn’t actually implemented.
I’ve also seen a lot of issues with co-workers using an LLM to write their readme files. I look at the readme for what return values I should get, go to use them, and get an error. I check the code, and sure enough, none of the variables in the readme exist. The LLM just thought they sounded good. Things like this I would say are pretty objectively wrong.
It was hype all day long, and managers forgot that AI is a tool and not some magic stick. A tool like a DeWalt or a Makita. After AI came out, some colleagues at my company expected me to generate 600-700 lines of code or more. I tried to explain that I cannot read or understand what's actually happening that fast, but they were like: just push, go, copy-paste it. Complete autodrive mode, insane. Then I spent the weekend fixing it, which made me doubly mad. What's actually happening is absurd, because with all the stories out there, managers think that Claude generates perfect code and that you could make a Twitter clone in half a day...
My personal experience: writing code has always been the easy part. AI does most of that now.
Understanding the problem and the existing system well enough to design the right solution, even with AI assistance, is a higher cognitive load. I’m doing a lot more of that lately.
I’m more productive, but also more tired. This may be due in part to the breadth of what my team owns, which makes my day a bit more context-switchy than other teams.
As others in this thread have noted, the situation is still evolving. However, I worry less each day about being replaced by AI. There has always been more work than available bandwidth in my experience.
What seems clear to me is that expectations around velocity and throughput will increase (are increasing). AI use will be required to meet those expectations. Learning to use this new tool effectively will be essential for career progression (and preservation).
> My personal experience: writing code has always been the easy part. AI does most of that now.
The only reason dev jobs paid more (by a factor of two or more) than pure solution modeling was because "writing code" was the hard part.
If you wanted to get paid just modeling the solution and handing it off to a coding team, those jobs were available for decades, typically called Business Analysts but few devs moved from dev to BA.
> Understanding the problem and the existing system well enough to design the right solution, even with AI assistance, is a higher cognitive load.
I've found that the act of physically writing refines my understanding a lot more than simply reading.
We don't typically expect a person to read a trigonometry textbook and then perform well on an exam. They have to drill problems to surface their misunderstandings to themselves.
My fear is that, with developers adopting your approach, they're "designing" systems in much the same way that a read-the-book-only trigonometry student solves trigonometry problems.
Agree. Also, there is a lot of fog at the moment. AI generates more code, we need a lot of markdowns now to teach it how to write "good code"... and <insert here a lot of AI processes>. But at the end... a programmer has to take ownership of that code and responsibility, meaning: reading A LOT of code and/or coding more code.
Responding to my own comment to add that I think this moment favors the curious and passionate. None of what I wrote above is a complaint. I’m having more fun now than I have in a long time.
Spot on, in my experience.
I have had some truly spectacular results that still kind of stagger me in the last few months using Claude in my hobby projects -- but even though Claude insists on trying to slip its name into the git history as credit it's not Claude -- it's me. Someone who has studied CS and software engineering for decades will craft different prompts from someone without that background. A suggested axiom: there is nothing I can build with Claude that I could not build myself with my current level of CS knowledge, assuming I had infinite focus and time. In my hands it can go as far as I could anyway, and no further. (But it is faster!) My experience bears that out so far.
Fair enough, but the speed, especially the kind that comes with LLMs, is enough to open new ways of working and doing things. We don't have infinite time, and if there's something that can give me, for example, multiple UI suggestions in a minute which I can pick from, it's a different way of working than sitting with a UI designer for several hours of discussion. So, while I agree with you in theory, I don't fully agree with what I think you're implying when it comes to practice.
> hobby projects
Unfortunately despite being impressive for solo stuff, such results don’t scale to software you’d give to others.
Claude writes probably 95% of our code now, fintech, amongst top 5 in the world in what we do. I am 100% certain we're not even at the forefront of using agents for coding compared to some others.
It definitely can scale.
> Someone who has studied CS and software engineering for decades will craft different prompts from someone without that background.
This, to me, is the biggest differentiator. In terms of results, there's a huge yawning chasm between the person who says "Claude make me a $thing" versus the person who puts in the effort to lay down the overall architecture, gives some thoughts to libraries and dependencies, performance trade-offs etc, and only then begins prompting.
Knowing how to implement Dijkstra or a linked list by heart is no longer important. Actual software engineering skills are more important than ever.
> Knowing how to implement Dijkstra or a linked list by heart is no longer important.
This was never important. The important part was always knowing when to use them.
>The important part was always knowing when to use them.
Two things can be true simultaneously. I think there was a time when deep familiarity with implementing algorithms was important.
always was. Still is.
The gap is closing; a shitty wannabe programmer will eventually learn the structures one way or another. Agentic coding just got many new people involved, and these new people create noise.
I'm a Senior Freelance Programmer, I can see many of my past and present clients moving towards the exact path you described. I keep warning them during meetings that Claude model isn't sustainable for long, eventually the VCs will come for their revenues and Claude will be forced to close their access to all but the most enterprisey ones with deep pockets. The mere electricity cost for that kind of high level reasoning and abstraction can't be subsidized forever. However, there are other forces which pull them towards Claude and AI workflows. Most of the clients are in a "wait and watch" mode right now, using LLM assistance for code generation but not fully depending on them.
Before LLMs came, there used to be the technical debt to deal with in a project, now there is also the added cognitive debt which is way more subtle and impactful long-term. If your source of truth isn't source code but a prompt (or even a series of prompts with branches) and the executor of prompts is a non-deterministic agent, I think you've already lost the battle there.
Using today's model prices as a rebuttal is a very weak argument.
Two years ago, SOTA was gpt-o1, and it was much more expensive than Fable. Now, for $4,699, you can easily run a much smarter Qwen3.6-35B locally with DGX Spark.
Think about where we are. This is an era where a new SOTA arrives every two months. It took LLMs only about 18 months to go from chain-of-thought reasoning to disproving the unit-distance conjecture. chatGPT itself is only three and a half years old.
DeepSeek V4, released two months ago, is almost as cheap as the electricity it costs to run, and would have been an absolutely top-tier model by 2025 standards.
> Claude model isn't sustainable for long, eventually the VCs will come for their revenues
This is cope. There are multiple open models that are already good enough and cheap enough at API rates to sustain this.
You ignore that Claude is not alone, that tech progresses and reduces costs, and that there are always the Chinese alternatives, which keep getting better over time.
Low-skill work that used to be outsourced will go to cheaper LLMs, unless wages are depressed enough / running costs are high enough to keep using humans as cogs in the machine. This will also consume a ton of small-scale things, like personal-sized automation and small-business customization of better-crafted things (stuff that normally wouldn't be paid for in the first place, or only extremely rarely). Some will obviously exist, because paying someone else to farm out a ton of mediocre output with LLMs is still worthwhile sometimes, but it's going to be gutted as a general statement.
Especially with prototyping-style work, LLMs are clearly good enough for a ton of business-oriented proof-of-concepts, and that line of work is essentially dead. Unfortunately a lot of mid-tier art falls into this category as well, particularly because execs very clearly can't tell good art from bad (on a "customers like this" scale, with functionality being the judge, which is fairly objective. not a subjective "this is good art").
High-skill work is still necessary, but it's hard to tell if it's actually going to be more important (because skill is obviously still needed for actually-good results, and I honestly see no evidence that this will change with current tech) or less (primarily due to less demand, and it being significantly harder for non-skilled to judge skill when everyone can prototype something seemingly-impressive in a weekend). Some will very obviously continue to exist though.
Whether this means "high-skill people are going to be fine, stay the course" or "<10% of high-skill people will be fine, you had better be scrambling right now or looking for a new line of work" is... much less clear.
The profession has already changed. For the past eight months, AI has been competent enough to code like the best human programmer, but strangely, the software isn't any better yet. Everyone has lost sight of what the profession truly is. It's not just about coding; it's about software engineering. Our role is no longer that of programmers, AI has taken over that role. Our role is that of engineers who manage programming agents. Every attempt to have AI develop a medium-to-large project fails because the goal is to solve everything with a magic four-line prompt. We're forgetting the structural aspect, the engineering side. We must treat the tool as just that: a tool. The direction and responsibility remain in our hands. It's not about reviewing the code line by line; it's about ensuring that the product faithfully represents a well-planned engineering intent. That's why the concept of AI-augmented Software Engineering is so important.
> AI has been competent enough to code like the best human programmer
It’s really not. Opus 4.8 can’t produce good software design and it still makes straightforward implementation mistakes. Two errors it made in one day for me recently: it built the Cookie class I asked for without a name field—cookies have a name and a value—and it neglected to handle a case where a database could have multiple rows with the same id, just returning whatever came back first.
The “best human programmers” absolutely would not have made those mistakes. At worst, they would have asked if I really meant what they thought I meant.
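To make those two mistakes concrete, here is a small Python sketch of what the stricter versions might look like. It is an assumption-laden illustration, not the commenter's code: `Cookie`, `MultipleRowsError` and `fetch_at_most_one` are hypothetical names, and `sqlite3` stands in for whatever database was actually involved.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Cookie:
    # Both fields are required: a cookie is a name/value pair,
    # so it cannot be constructed without a name.
    name: str
    value: str


class MultipleRowsError(LookupError):
    """Raised when a lookup expected at most one row but matched several."""


def fetch_at_most_one(conn, sql, params=()):
    """Return the single matching row, or None if nothing matches.

    Raises MultipleRowsError instead of silently returning whichever
    row happened to come back first.
    """
    # Fetching two rows is enough to tell "exactly one" from "several".
    rows = conn.execute(sql, params).fetchmany(2)
    if not rows:
        return None
    if len(rows) > 1:
        raise MultipleRowsError(f"query matched more than one row: {sql!r}")
    return rows[0]
```

Whether duplicates should raise, be merged, or be resolved by some ordering is a design decision for the person who owns the data. The point of the sketch is only that the decision gets made explicitly rather than by accident.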
I understand your point, but what you're describing is exactly the kind of mistake even the best human programmer could make in a poorly managed environment. I'm concerned that since AI emerged, we've overestimated our programming abilities. The comparisons we make between our own work and AI are based on an assumption of absolute perfection that doesn't exist in reality. Bugs aren't an invention of AI; they're ours. All modern software engineering, testing systems, version control systems, and so on, were developed through years of dealing with our own mistakes. We don't make systems fault-tolerant by understanding that failures are external to our work. These failures are our doing, and now they're AI's doing too. We have to deal with applying to those agents the management that we previously applied among ourselves. The example you provide is very good, because you yourself, with your human mind that solves problems, suspect that the origin of the problem was poor communication, and you are very likely right, but just as if it were a human error, the programmer is responsible for their faulty code, but you are responsible for poor process management, and yes, the same applies to working with AI agents.
What are you writing that Claude is actually writing all of it? Every time I get past the green field stage, I just end up throwing out what it writes half the time since it's trash. Claude seems really great at fix this unit test, generate this boiler plate, take this uml and build this framework out. But when I am doing refactorings, or implementing things that are beyond monotonous, I end up writing it all by hand. My best luck is still to do the design, query AI for possible choices, sketch out the framework of what I am writing, have AI critique my plan, and then have AI design individual methods, then fix what it writes.
What you say could be theoretically possible, but it's probably an issue with your usage of it. For example: if any of these hard non-promptable projects is available on GitHub, or you've seen this problem in any large-scale GitHub project, you can share that. I've rarely seen a repo and a problem that Claude can't chew through with the right prompt.
I mean this with no disrespect, but
> Every time I get past the green field stage, I just end up throwing out what it writes half the time since it's trash.
Is a skill/PEBKAC issue. You still need to exercise engineering best-practices like decomposing work to the smallest unit before taking a task on, brainstorming design first and implementation last, clearly defining your success criteria and requirements before beginning any work, etc.
I'm on a >10yr old codebase and have been able to get my org to orchestrate entire features, fully unit tested, e2e tested, storybooked, from scratch without touching an IDE. Refactorings and the endless mountain of 80% completed migrations from one pattern to another are now trivial to offload.
Point your SOTA du jour at the original docs, a few of the original examples/PRs and have it draft a skill describing the work, the scope, and the success metrics. Iterate on the skill with the main agent by subagenting to test the skill until you are happy with the result and it mostly gets it right with the guardrails you've defined. Again - keep the scope extremely small. It gives much less rope for the agents to hang themselves with and it is less cognitive load when you have to review/test the PR.
Then set up a reasonable cadence for it to execute an autonomous thread on and review when you get comfortable.
----
The issue I've been running into lately is simply that we've got so many PRs coming in that actually doing thorough human reviews on them is not sustainable relative to the rate the team is creating agents to open them and people (especially juniors and mid level) are getting burned out by essentially having entire days where they are just doing code reviews.
"Computer" used to be a job title. So no, I am not optimistic about the future of most programmers, maybe even all programmers.
One possibility is that software starts to look more like traditional manufacturing.
The machine is the company’s core asset. The engineer only needs to know how to operate the machine well. Once that happens, the barrier gets much lower, far fewer people are needed, and the job naturally becomes much less valuable. Some parts will still need to be done by hand, of course. But only a very small part. It is like old factories. They used to need lots of fitters, at all levels of skill. Now you only need a few of the elite ones.
AI is the CNC machine of the software industry.
The more pessimistic future is that, maybe five years from now, the best programmers will look at AI the same way the best Go or chess players look at AI today: Like Ke Jie said, "I don't even know what I am trying so hard against." We now have a new SOTA every two months. It took LLMs just 18 months to go from reasoning models to disproving the unit distance conjecture. ChatGPT itself has not even existed for as long as a college student spends in university.
In any case, we have already passed the point where this can be rolled back.
Maybe ten years from now I will be leaving a comment saying that "programmer" used to be a job too :-/
Programming is the low-hanging fruit for AI. Open source and knowledge sharing have given it a huge amount of public, high-quality training data at a level other industries can hardly imagine. And almost everything in programming can be tested and verified inside the computer quickly in a closed loop. No robot arm is needed.
The main weakness of current LLMs is still that they are static: They do not really change themselves through use. Harness tools are just elaborate ornamentation on top of prompts. LLMs are frozen at the moment training stops. Once we get models that can change their own weights through self-feedback, then maybe AGI really is on the horizon.
Thinking optimistically: I may be lucky enough to see it in my lifetime. Maybe by then, people will be able to live more like human beings, instead of organizing their whole lives around work :-)
Thank you for your comment. I enjoyed it a lot. Good food for thought.
Your analogy is a bit of a leaky abstraction in the sense that it misses the broadly statistical nature of LLMs. However, I find it is a good way to illustrate the potential industrial transformation.
It is hard to say what the future will bring. The original Ask HN post is definitely an omen of things to come.
I've posted a recent article about the future of software development https://saturnino.substack.com/p/out-of-the-loop?r=7eqhw&utm...
Basically, in a decade or so, we'll be completely out of the loop in software development; even this title won't exist anymore (like the 2000's webmaster). We'll still be around, but with different roles.
For what it’s worth, I find comments and articles with assertive predictions like this difficult to take at face value.
I don’t even disagree with the premise, but it shifts the burden of assessing likelihood back onto the reader, so it doesn’t really add much value to me.
Software was never a precise occupation. It was a tool shed that obeyed the Pareto Principle: 20% of the people deliver 80% of the value.
This was the expectation because hiring was the primary goal, not product delivery. The industry could have fixed this by upgrading itself from an occupation to a profession built upon standards and credentials. That never happened because the 80% would become unemployable, which adds friction to hiring (the real business goal before COVID).
Now, that ship has sailed and there is no going back. The Pareto Principle is now a 5%/95% funnel because there is less incentive to learn to do the work. Good luck!
Eventually the costs of running inference will catch up to us, then we will see. But LLMs are really expensive, and it is possible that with the incredible amounts of code they generate it might become too expensive for them to keep up with it. There should be some kind of equilibrium which might take a while to reach but I think knowledge work won't disappear.
However many people have rightfully been saying it for years before LLMs that many so-called software engineers had no business in this field because for a lot of them it was just a way to earn more money than peers. It's not an issue by itself, just a rational human choice but the fact that it was possible was just because of unhealthy economic conditions.
My impression is that smaller companies, which depend on rapid prototyping to gain clients, put a lot of pressure on their devs to use LLMs. At least that's the situation in the companies some friends of mine are working at.
I'm in a slow-moving, much bigger company. Lots of talk about "AI" here and we can use copilot if we want to, but there is 0 pressure. I'm in a small team and one colleague uses copilot often. In the beginning there was a minor conflict between him and me, because I found the quality of the LLM code unacceptable and had to ask him to review it more carefully. I think that's settled now, but it makes me sad how a once motivated colleague now seems to try to cheat his way out of work.
I personally find it incredibly boring to write copilot prompts or read its answers full of boilerplate and sycophancy. I don't understand how anyone would want to replace the cognitive work of programming, which I find enjoyable for the most part, with the cognitive work of "talking" to an LLM.
Anyway, I think it will be like this at least for a little while longer: only vibe coding allowed in small companies and less vibe coding the bigger the company is.
But before vibe coding can take over the slow-moving big companies, all the accumulated technical debt will come back to haunt us and vibe-free software will be the new fad. That's what I hope at least.
It's similar for us to a certain extent. I honestly don't know yet where this will lead. Personally, I also don't really follow the argument that the agent does the coding and the human does the understanding. In my opinion, one thinks differently about the code when not writing it oneself: on a higher level, or let's even say a more superficial one. To keep the understanding on the same level you would have to limit the agent to just "typing", but this is definitely not what's happening.
Yes, may be a skill issue[tm], may be an inherent one or it may not even be relevant anymore, we will see.
For me it currently still works well because I am working on legacy systems that I have a more or less good understanding of, so I can judge the agent's code. Not sure how this will be with new, greenfield code.
Not even starting with the discussion how it should be possible to review the sheer amounts of generated code.
In my experience, AI right now is not perfect, yet people still treat it as if it were: leaked secrets, 10 bugs created to solve one minor issue, a whole backend that is a mess and won’t hold up at scale. Even if it keeps improving, all these issues disappear, and AI takes over most of the technical hurdles, what will be left?
Materializing and scoping down a broad idea (or just an abstract vision) into what needs to be built. It’s one thing to say
“Hey Claude build me a fitness app”
and another to actually understand the customer behind it, their behaviour and psychology, their journey, and the actual problem you’re solving.
Tech in general, and programming in particular, emerged from the need to create new things in a digital world to solve our problems. Building them just got easier, but understanding the problems and the people behind them is going to be an increasingly necessary task.
It's disappearing. Even if models stop evolving tomorrow, there's still enough potential in the harness improvement to reach the point where anything 99% of us know about software engineering is useless. How many humans will be involved in creating software after the dust settles is anybody's guess, but I wouldn't bet on it being anywhere near the current level.
For me in large tech:
- Humans still own the code
- All code reviewed by humans
- LLM adoption varies across the org. Some are heavy users and some less. Some suspicious some less.
Where are we heading? Depends on model/harness capabilities. Likely some sort of mix where some projects will still require expert humans and others will just be vibe coded. How much we lean in that direction - we'll see.
I will just say, if you are any good at programming and have experience using agents, you're in the top 0.1% of the world in adoption of a critical new technology.
It may seem hopeless as a programmer, but imo you'd be much better off reframing your situation re: the above sentence.
Iunno, I feel like being born in a first world country did most of that "top 0.1% of the world" work. That sentence works the same with and without AI/LLMs.
Among peers, I feel like I am top 20%? 30%? maybe, by being a good programmer who is adept at agents. A year ago was the 0.1% point, this stuff is spreading like wildfire. A year from now I think it's going to just be de rigueur that these are our tools now.
Worse, any edge I have from working with this stuff for years is quickly dulled. The tools are evolving fast. My tricks from 3 months ago have been eclipsed.
"Well, in our country," said Alice, still panting a little, "you'd generally get to somewhere else—if you ran very fast for a long time, as we've been doing."
"A slow sort of country!" said the Queen. "Now, here, you see, it takes all the running you can do, to keep in the same place. If you want to get somewhere else, you must run at least twice as fast as that!"
Remember you had to quit social media to keep your sanity in check? Ok, now AI. Same thing.
Not the same thing. Developers' clients are being approached by thousands of people instead of a handful. It creates the illusion that everyone can do the same thing for cheaper.
For the last 6 decades or so, a computer was a machine assumed to operate with high levels of precision and deterministic outputs. Such precision enabled spacecraft like Voyager 1 & 2 to travel billions of miles from Earth, staying on course, semi-operational and sending telemetry 50 years after launch.
Now we have machines that, when asked to produce a paperclip, may instead produce a butter knife, or a banana, or maybe just a "try again later".
These modern "tools" are quite a different animal. They're more akin to roulette wheels that generate massive amounts of heat and CO2.
This is cope. SOTA agents produce exactly what's asked; usually it's the asking that's the problem, not the result. Improve the prompt and the output drastically improves.
Anecdotal experience amongst my small team:
- spending exorbitant amounts of time up front planning, surfacing every milestone, subtask, and individual change, burning tons of tokens in addition to man hours fixing minute but important mistakes or deviations despite explicit instructions, CLAUDE.md, memory subsystems, agent and skill instructions, etc.
- executing and reviewing results compared to the plan and finding even more mistakes and deviations from the plan often takes a lot of time
- the relative rapidity produces a lot of output for the team, which stresses our lifecycle and introduces feelings that the whole process is on the verge of flying off the rails
- individual developers have different expectations, definitions of acceptable, skills/experience to detect and deal with problems, and patience. I might spend hours to days meticulously planning and executing a ticket and another guy might yolo it in 30 minutes. Other than bug escape rate and tracking review failures I’m struggling with how to track people who are “doing it wrong” let alone telling them the “right” way to do it.
- growing exhaustion, lack of ownership and confidence, frustration, and a generalized feeling of endlessly fighting your tools but in a weird way they never seem to really improve despite all your efforts to do so
- taking humans out of the loop and letting the agents be more autonomous in hopes that we’ll reduce the bottleneck and produce better results has not helped.
- I find even myself fighting (and sometimes failing) the urge to give in even though the proposed or implemented solution doesn’t feel right. Scale that across your team
- experience has not really changed despite changes in models and harnesses
- there’s a deep feeling that I’m doing something wrong and fomo since so many people in the industry boast of incredible results. I probably am, but everything I’ve read about and tried has not really moved the needle much and it’s introducing another dimension of exhaustion and frustration
Overall, I don’t feel like we’re being much more productive when you factor in quality and accountability (which should be a given, but this industry is increasingly overtaken by a reckless philosophy of speed over everything else). I do think it has helped parallelize tasks, produce higher quality PoCs to explore more options and do it faster, offload joyless but necessary tasks that are narrow in scope and measurable, do exploratory work and act as a generalized interactive knowledge base, and get a shallow start on techniques, technologies, etc. that you don’t yet have experience with. Maybe I’m just missing a critical component or two in the process (formal specification, etc). Maybe it’s growing pains, or maybe there’s a looming rot. Either way, job satisfaction and confidence in my work are much lower than I would have expected.
code is like assembly now.
in the olden days (pre-LLMs) we would write high-level code.
the entire layer was high-level code and rarely would we ever need to peek into the assembly:
writing, debugging, architecting, reviewing, testing - all were done in the high-level language layer.
---
welcome to present day:
since we don't write code - we write intents, we shouldn't review code either - we should review intents.
I don't review my code anymore. I ask the agent to generate markdown docs, graphviz diagrams, changelogs, audit reports, etc. I only review that.
I also ask it to write tests and evaluate by whether the tests passed or not. I don't need to peek into the test code - I can also ask for plain english, pseudocode, control flow graph, whatever it is I want.
I can ask it to find errors or missing tests and improve that too!
code is like assembly now.
rare are the cases you would need to peek into that level.
That may be wishful thinking on my part, but I see it moving towards how other engineering fields work: project engineers design it from scratch, everyone must speak architecture, customer and compliance at once, and we will have standards and "codes" drawn up by the end of this decade.
Same here. At our company, we've pretty much stopped writing code by hand. We hand the implementation to Claude and Codex now. Feels like the real skill is moving up a level: architecture, design choices, and knowing what should be built in the first place.
I see nothing wrong with something probabilistic. I think it is all about offsetting the risk and reducing the odds of bad outcomes. There is this concept of Defence in Depth, thus I assume some sort of binomial formula also applies here.
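To make the "binomial formula" intuition concrete, here is a rough Python sketch. It assumes each review layer misses a given defect independently of the others, which real pipelines rarely satisfy, and the layer miss rates and defect count are made-up numbers for illustration only.

```python
from math import comb, prod


def escape_probability(miss_rates):
    """Chance that one defect slips past every layer, assuming independent layers."""
    return prod(miss_rates)


def prob_at_least_k_escapes(n_defects, p_escape, k=1):
    """Binomial tail P(X >= k) for X ~ Binomial(n_defects, p_escape)."""
    below_k = sum(
        comb(n_defects, i) * p_escape**i * (1 - p_escape) ** (n_defects - i)
        for i in range(k)
    )
    return 1.0 - below_k


# Hypothetical miss rates: agent self-review, automated tests, human review.
layers = [0.5, 0.4, 0.2]
p = escape_probability(layers)

print(f"one defect escapes all layers: {p:.3f}")
print(f"at least one of 50 defects escapes: {prob_at_least_k_escapes(50, p):.3f}")
```

Under these assumptions each extra layer multiplies the per-defect escape rate down, but a high volume of generated changes pushes the chance of at least one escape back up.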
I mean, literally the answer is that nobody knows. Maybe the robots replace us all. Maybe they shift those who remain into being some combination of Product Manager and QA. Maybe there's still a role for a technical overseer even in the medium-long run.
But it sounds like you're really asking about the state of the world today. If so, I don't think that ideal state is like your friend's company (or at least, as it appeared to be to you). It might be possible that you can make that "dark factory" pattern work (StrongDM seems to be doing it), but it would require infrastructure and discipline that I doubt they're mustering. Think about how CD didn't involve taking a sloppy build process with no testing or observability and just going straight to prod -- it required building up a lot of infra and discipline first.
But on the other hand, I don't think the ideal present involves artisan hand-crafting code either. I haven't written a line of code by hand in enough months that it would genuinely feel weird if I were to try to program that way despite decades of having done just that. That era's done with, and moderate normie practices right now today are more about supervising and guiding agents than about chiseling code into clay tablets.
This is the 2020s re-enactment of the early-2000s WYSIWYG editors.
From what you said: Not looking at code is bad, not because Claude can slip a few bugs (it can), but because LLMs tend to default to writing more code and features than needed, which isn't a good thing. I see a lot of people making 10+ PRs per day, but most of them are just going back to fix earlier PRs.
Claude always likes to "go big," for example, by choosing tools that can support millions of concurrent users or by adding unnecessary layers of abstraction that create more maintenance pain. I guess that's good for LLM companies, since more tokens are spent fixing the mess it caused.
Every time I enter plan mode for a huge feature, I end up cutting about 30-60% of the task scope before the LLM can actually start the work. I review the final code, and I still find things to cut. As said before "The best code is no code, or code you don’t have to maintain" [0]
0: https://www.simplethread.com/20-things-ive-learned-in-my-20-...
> it feels like software development is going from a precise occupation that requires high degree of understanding to something probabilistic and offloaded understanding
To me it felt like there were always engineers and vibers.
Vibers don't work systematically, never test, accept unknown regression, don't use git, and if they do, treat it like Dropbox, use terrible languages, have terrible habits. Vibers got a new tool, too, and it vastly increases the amount of slop. But the slop was always there.
The slop actually got better after vibers stopped writing their own slop, I have to say.
And vibers are less defensive about the particular slop they didn't write themselves.
"This is exactly why we built AINAScan — we found that AI-generated code passes all tests and 'works', but consistently produces the same 15 structural bugs: save functions that never write to DB, async functions with no await, parameters that have zero effect on return values. Linters miss all of these. The code looks fine until production."
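For illustration, one of the bug classes named above (async functions with no await) can be flagged with Python's standard `ast` module. This is a minimal heuristic sketch, not how AINAScan works; the function names and the `SAMPLE` snippet are hypothetical.

```python
import ast

# Node types that count as "awaiting" inside an async function.
AWAIT_NODES = (ast.Await, ast.AsyncFor, ast.AsyncWith)
# Nested scopes whose awaits belong to them, not to the enclosing function.
SCOPE_NODES = (ast.FunctionDef, ast.AsyncFunctionDef, ast.Lambda, ast.ClassDef)


def _awaits_anything(func):
    """Return True if the function's own body awaits, ignoring nested scopes."""
    stack = list(func.body)
    while stack:
        node = stack.pop()
        if isinstance(node, AWAIT_NODES):
            return True
        if isinstance(node, ast.comprehension) and node.is_async:
            return True
        if isinstance(node, SCOPE_NODES):
            continue  # do not descend into nested functions or classes
        stack.extend(ast.iter_child_nodes(node))
    return False


def async_without_await(source):
    """Names of async functions in `source` that never await anything."""
    tree = ast.parse(source)
    return [
        node.name
        for node in ast.walk(tree)
        if isinstance(node, ast.AsyncFunctionDef) and not _awaits_anything(node)
    ]


# Hypothetical input: `save` looks asynchronous but never awaits the write.
SAMPLE = '''
async def save(record):
    db.write(record)

async def load(key):
    return await db.read(key)
'''

print(async_without_await(SAMPLE))  # ['save']
```

A check like this only parses the source and never runs it, so it will also flag async functions that are intentionally await-free, for example to satisfy an interface.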
No mention of whether the product is actually good.
We're still running the race, but it's just not on foot anymore. You can still run it into the wall if you're not careful where you're going.
My profession is not, and never was, _programmer_. Lines of code, the actual text, are a means to an end, not an end in and of themselves. I'll take heat for that here for sure. But do you think a carpenter considers himself "one who screws nails" or "glues joints"? No, the small minutiae of the job were never the job itself.
> But do you think a carpenter considers himself "one who screws nails" or "glues joints"?
Sure, but a carpenter who is unable to use a screwdriver without hurting themselves is unlikely to produce robust furniture with a powered driver.
I think the genie gets put back in the bottle, at least partly.
I don't think the future is massive data centers running at a staggering loss to generate questionable code.
The future is rethinking IDEs to have local models work in partnership with the developer to ease tedium and catch mistakes.
A model that maintains a visual, zoomable mind-map of the entire project, with two way binding. Code can be created visually or textually, same with data flows.
Project structure and architecture are presented in high-level ways, that can be easily altered and refactored with almost zero tedium.
I think we start using AI for what it's good for: pattern matching and transformation, and stop trying to make it reason and pretend like it's a human.
Once we, as an industry, figure this out we'll unlock a massive boost in quality and productivity, but it looks like there will be some painful times ahead before everyone realizes that the token extrusion machines are only increasing the total cost of ownership, and they are being used incorrectly when we try to outsource our thinking to them.
I think there's an enormous opportunity to build these tools right now, and that whoever nails it will win.
There was a very similar Reddit thread earlier, with some interesting comments there too:
https://www.reddit.com/r/technology/comments/1ueidyv/softwar...
> I had an interview where I was asked the obligatory “what’s your AI workflow” and I said I use it for searching documentation and writing small functions or boilerplate that are tedious. Then I was asked whether I use Cursor. I said no, and immediately was told that “I’d be a better programmer if I used Cursor”. I have 13 years of software engineering experience, and was talked down by an AI startup with no minimal viable prototype. Then I was told I did not have the experience for the role. I love this timeline so much
how is that company doing?
i think that is a more important question that you shouldn't ignore.
do they have growing revenue?
And more important, how will they be doing in a year or two?
This has always been a very different profession depending on where you work and what you're working on.
I haven't worked at a startup in over a decade, but the stories I hear now sound the same as back then. There's lots of wasted effort for mediocre to poor code destined to be rewritten or thrown away until there's enough investment to justify more work. At which point, "more work" just means more sprawling slop instead of fixing the technical debt rotting at the foundation.
AI just put a spotlight on the futility of trying to run before you can walk. Whether so many founders are going to stay in denial about it is yet to be seen. Statistics about any line of business say yes. This is how most businesses fail, and most of them have to fail.