Hacker News | hombre_fatal's comments

Well, software presumably has a goal of accomplishing something for some end-user, so the progress should be trivial to measure: are features/changes being completed?

The marketing ploys of OpenAI/Anthropic where agents build something that nobody uses might be hard to track given that there are zero users. But what about everyone using agents for real software? It's trivial to prove that agents make progress.


Yes that is the entire point. Measure features deployed in production and their value in gaining and retaining customers or users, cost reductions, reduced incidents and outages, etc.

Lines of code is completely irrelevant as a metric.


I don't mind the workflow since I'll spawn new agent sessions in new terminal tabs until my attention is saturated by round-robin'ing through them.

It's actually kinda pleasant, especially when I consider all the tickets I'm not excited about doing. It's prob worth focusing on that aspect of it.


My job these days is listening to Opus 4.8 (max effort) and Codex 5.5 (max effort) talk back and forth, particularly to generate/review/revise plan files.

Fable 5 has been a major improvement in high-level reasoning, like taking a plan file that has been optimized to the point where neither Opus nor Codex can find anything to change about it (neither in direction nor impl-detail), and Fable 5 will find high-level directional simplifications and pivots, or it will consider the best pivots itself and explain why it rejected them in favor of the plan's direction.

It's so expensive though. A single review of a plan file with Fable 5 (xhigh effort) will use 2-3% of my hourly limit on a $200/mo plan.

I think my new workflow is to generate the initial plan with Opus 4.8 (max effort), get Fable 5 (xhigh) to review it for directional feedback, then start the Opus<->Codex revision loop from there.
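A minimal sketch of that workflow as a control loop. It assumes each model call is wrapped in a plain Python callable; `draft`, `directional_review`, `reviewers`, and `revise` are hypothetical stand-ins, not any real agent API.

```python
from typing import Callable

# Hypothetical types: a reviewer returns requested changes (empty list = no objections).
Reviewer = Callable[[str], list[str]]
Reviser = Callable[[str, list[str]], str]


def revise_until_quiet(plan: str, reviewers: list[Reviewer], revise: Reviser,
                       max_rounds: int = 5) -> str:
    """Revise the plan until no reviewer requests changes, or rounds run out."""
    for _ in range(max_rounds):
        feedback = [note for review in reviewers for note in review(plan)]
        if not feedback:
            break
        plan = revise(plan, feedback)
    return plan


def plan_workflow(task: str, draft: Callable[[str], str],
                  directional_review: Reviewer, reviewers: list[Reviewer],
                  revise: Reviser) -> str:
    """Draft once, take one expensive directional review, then run the cheap loop."""
    plan = draft(task)                 # e.g. the Opus draft
    notes = directional_review(plan)   # e.g. the single Fable pass
    if notes:
        plan = revise(plan, notes)
    return revise_until_quiet(plan, reviewers, revise)  # e.g. Opus<->Codex
```

The expensive reviewer is called exactly once, up front, so its cost doesn't scale with the number of revision rounds.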


How do you arrive at that split? The real world is more like seniors doing the high-level planning, juniors implementing, and seniors reviewing. Does this not translate?

Ideally I'd have Fable 5 make the plan, but creating a concrete plan is the most token-expensive part since the agent has to do the most research.

Fable 5 is 2x the cost per token of Opus 4.8, and it's much less work to review a plan than generate one.
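A rough sketch of that arithmetic. Only the 2x price ratio comes from the comment; the token counts are made-up placeholders to show the shape of the trade-off, not measurements.

```python
# Relative prices: only the 2x ratio is from the comment above.
OPUS_PRICE = 1.0                 # arbitrary unit per token
FABLE_PRICE = 2.0 * OPUS_PRICE   # "2x the cost per token of Opus 4.8"

# Assumed token counts (hypothetical): planning needs far more research than review.
PLAN_TOKENS = 400_000
REVIEW_TOKENS = 50_000


def workflow_cost(plan_price: float, review_price: float) -> float:
    """Relative cost of drafting with one model and reviewing with another."""
    return PLAN_TOKENS * plan_price + REVIEW_TOKENS * review_price


fable_plans = workflow_cost(FABLE_PRICE, OPUS_PRICE)  # Fable drafts, Opus reviews
opus_plans = workflow_cost(OPUS_PRICE, FABLE_PRICE)   # Opus drafts, Fable reviews
print(fable_plans, opus_plans)
```

Under these assumed numbers, putting the pricier model on the review step costs noticeably less than putting it on the drafting step, because the price multiplier lands on the smaller token count.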


Given the amount of love, energy, and attention it takes to raise a kid, I don't see why I should care how selective somebody else wants to be.

This sort of thinking is incredibly dangerous to society at large, to say nothing of the danger to your own heart and soul.

Character, beauty, love, sacrifice. Every one of these involves pain and it makes life worth living. You can't avoid pain, so you might as well engage it in service of those you hold dear.

You should very much care about society wielding a sword like this, because historically we do not wield it well.


There really is no point in leaning into avoidable pain. Pain does not "make you stronger". Pain is not "beauty". These are all bullshit tropes invented by abusers to keep people from questioning things.

I agree with you, unnecessary pain is not virtue. But pain does indeed make you stronger, or it can make you bitter, or depressed, or insane. How you choose to work through the inevitable pain you will face is what determines if it makes you stronger or not.

It's easy to hand wave the question of pain away, but much wiser men than you and I have arrived at very different conclusions than you suggest.


The sword being just deciding which embryos you will raise?

It's very easy to demand or promote sacrifices you expect other people to make. But I don't find that to be very empathetic.

I've already seen how society is when we shame and hand-wring about the personal decisions others make, and it's not one I want for my kids. At some point you need to be satisfied with your own decisions and then let other people make theirs.


I reject the premise of "live and let live" whereby no one is allowed to suggest I live a better life nor am I allowed to suggest they improve theirs.

I can be both satisfied with my decisions and still wish better for my neighbor; these are not mutually exclusive.

I am not saying, don't choose between embryos. I'm saying, be careful because it's a slippery slope and not a slide you want to ride.


Let's set the standard at developmental and genetic disorders.

Eliminating Down syndrome and cystic fibrosis, for example, seems not only reasonable but a moral imperative.


Yeah, "Fiber: good source" when 100g of raw bean sprouts gives you 1.8g fiber (less than a 2" kiwi), and pho comes with much less than 100g of sprouts.

Pho is a pretty bad source of fiber.
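To put numbers on it, here's a quick sketch. The 1.8g per 100g figure is from above and 28g is the US FDA Daily Value for fiber; the 30g garnish portion is a hypothetical guess, not a measured serving.

```python
SPROUT_FIBER_PER_100G = 1.8  # grams of fiber per 100g raw bean sprouts (from above)
DAILY_VALUE_G = 28.0         # US FDA Daily Value for dietary fiber
ASSUMED_GARNISH_G = 30.0     # hypothetical sprout portion served with one bowl


def fiber_grams(portion_g: float) -> float:
    """Fiber contributed by a given weight of sprouts."""
    return SPROUT_FIBER_PER_100G * portion_g / 100.0


def percent_daily_value(fiber_g: float) -> float:
    """Fiber amount expressed as a percentage of the Daily Value."""
    return 100.0 * fiber_g / DAILY_VALUE_G


full = percent_daily_value(fiber_grams(100.0))
garnish = percent_daily_value(fiber_grams(ASSUMED_GARNISH_G))
print(f"100g sprouts: {full:.1f}% DV, assumed garnish: {garnish:.1f}% DV")
```

Even a full 100g of sprouts lands in the single digits as a percentage of the Daily Value, and a garnish-sized portion is a small fraction of that.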

It sucks that we're skipping over such good tools like cronometer.com to figure out what we're actually eating and going straight to hallucination, adding more confusion to nutrition.


I constantly open and close terminals too. Maybe I'm doing a quick lazygit check on cwd. Maybe I'm opening up an ephemeral claude/codex session for a couple questions about why a test failed. Or quickly editing a file with vim. Or remembering where I put that file with yazi or fzf. -- I don't even know, but all of it is contingent on it being fast to open a new terminal in cwd.

So much so that I vibe-coded my own terminal emulator for vertical tabs on macOS (using libghostty for the terminals) that is faster and less weird than iTerm.


Constantly. I think we've used the excuse of "well, what if you just launch it less often?" enough to excuse bad performance defaults, especially when alternative solutions fix the issue with very few trade-offs.

Just this week I was stuck in this state where my AirPods were receiving audio from the iPhone in my pocket (intended), but play/pause commands from the AirPods were sent to my MacBook in the other room.

Nah, technology advancement is full of free lunches.

It breaks our ape brain intuition that anything good must also be bad. But consider all the food tech you take for granted while singling out zero-cal sweeteners.


It always seemed like a weird default to let people (esp strangers) submit PRs that weren't tied to an issue or approved in advance.

What do you mean you just spent a week implementing something in secret?

AI makes it extra silly because now you can craft up your unsolicited code change in minutes, making it extra obvious that code changes should spawn from real discussion and agreement.

TFA is part of looking for new processes that actually work. Dunno why people have such rose-tinted glasses about pull requests. Open an issue, talk to people. Have an idea? Then get people to cosign it.


I think it was different pre-AI. Someone might come in and spend days getting some understanding of the codebase before they contribute some minor fix. Over time they might stick around and make some more of these, progressively gaining trust so when they do take on something bigger the maintainers will know they aren't wasting their time reviewing it.

Now they can drop a multi-thousand-line, poorly understood PR on day 1.


As someone who maintained FOSS libraries pre-AI, I think the frequency might have changed, but large drive-by PRs with thousands of changes happened before too; I've been on the receiving end of those many times. Usually they fundamentally change the architecture too, then the submitter gets offended/sad/surprised when you tell them you couldn't possibly accept it and that they should stop wasting their time contributing without discussing first. It usually ends with some threats about how their fork will take all the contributors or something like that.

What I don't get is why these LLM users aren't asking their LLM how the project prefers to receive contributions and how they can make sure theirs gets accepted. The very same tools they use to code can be used to make sure their PR follows all guidelines, from discussion to acceptance of the PR itself. It's right there, they literally just have to prompt for it! Such a lazy group of people.


Good faith PRs were also suffering under the current model. I've opened PRs by hand on small projects to try to fix personal issues that probably affected others. Then the PRs languish for months, or in one case literal years, under the deluge of AI slop being spammed at the repo. I'm not going to ping the maintainers constantly when I know they are struggling, so I'm left running my fork and no one else gets the benefit.

Big projects pre-AI also can have hundreds of rotting PRs. It's a lot of work to go through them, and unsolicited PRs are kind of the wrong way to spend time as a maintainer.

AI just makes it so obvious how bad of a process it is that we can't ignore it anymore, and now we need to finally figure out good processes.

Even little stuff like: I've created issues on the Claude Code GitHub that got agreement and then led to code changes. Why isn't there a default, built-in way for my issues to rise above the zero-effort chaff? If you finally do the work of vetting someone's PR, why isn't there a built-in (hidden) way to +1 someone so we can see that they have some reputation with the project on their future issues/PRs?

