This is an excellent point. As a novice using LLMs for projects I could never previously have dreamed of doing, I find myself looking for the same thing: examples or citations of exactly what agents are writing incorrectly and how a human would do it better. I'm sure they're out there; maybe someone can point to some good content showing such examples.
I have no doubt the top nth percent of coders could write circles around Claude or Codex, but how much worse are those tools than your average schnook?
Reality: the top nth percent of coders are seeing absurd, dramatic gains in productivity using LLMs. See: antirez, Simon Willison, Steve Yegge.
The more experience you bring to the table, the more value you get from these tools.
Look, about 12 years ago, articles about how if you're not pair programming you're doing it wrong were on HN's home page every day. Doing well-prompted plan -> agent -> debug cycles is like pair programming with someone who knows every SDK and API intuitively and doesn't have to pick up their kids from daycare at 4pm.
While I don't actually disagree - to me, Gas Town sounds literally insane - I suspect that if you reframe his work to compare it against the cost of developing a new medication or chip fabrication technique, you can make a strong argument that he's putting his money where his mouth is to see how far he can take a new technology. He's doing science! And I think that's admirable, even if nothing comes of it.
When I think of how much money gets wasted on gambling apps and how much human potential gets wasted watching reality television and compare that to Steve going full Alexander Shulgin with LLMs, the comparison really falls flat.
The problem is what they do to large existing systems: subtle misunderstandings mean subtle bugs are constantly being introduced, and very few shops have adequate systems in place to receive reports of subtle issues at the rates they occurred 10 years ago, let alone today. And don't even get me started on the LLM-assisted support that some might suggest as a solution.