Hacker News | InvidFlower's comments

Don't forget the employees doing the actual model training and research are not the same ones coding Claude Code. CC was a side-project by one employee that ended up hitting it big and is now one of the core parts of their income. They were never a software company per se. This is like how MidJourney was just on Discord forever, because no one on the team was a real web developer. Discord made it easy to get something out there, scale up to many users, etc.

As the other person mentioned, they have said they are restricting third-party agent systems like OpenClaw and Hermes from using the monthly plan. But yeah, this seems like the wrong way to handle it, trying to detect those other harnesses via clues and auto-changing billing. Instead, it'd be better to allowlist rather than blocklist: require some special encrypted signature system or similar that only Claude Code and their desktop app implement. Any other requests just get immediately blocked. Then there's a separate key for normal pay-per-token usage that is just unrestricted. Would make things waaaay clearer.
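For what it's worth, the allowlist idea could look something like per-request HMAC signing: the official client signs each request with a key baked into its build, and the monthly-plan endpoint rejects anything without a valid signature. A minimal Python sketch, where the secret value and header names are purely hypothetical (this is not Anthropic's actual API):

```python
import hashlib
import hmac
import time

# Hypothetical shared secret embedded in the official client build.
CLIENT_SECRET = b"example-secret-baked-into-official-client"


def sign_request(body: bytes, secret: bytes = CLIENT_SECRET) -> dict:
    """Client side: attach a timestamped HMAC over the request body."""
    timestamp = str(int(time.time()))
    mac = hmac.new(secret, timestamp.encode() + b"." + body, hashlib.sha256)
    return {
        "X-Client-Timestamp": timestamp,
        "X-Client-Signature": mac.hexdigest(),
    }


def verify_request(body: bytes, headers: dict,
                   secret: bytes = CLIENT_SECRET,
                   max_skew: int = 300) -> bool:
    """Server side: reject anything not signed by an official client."""
    try:
        timestamp = headers["X-Client-Timestamp"]
        signature = headers["X-Client-Signature"]
    except KeyError:
        return False  # unsigned request: not an official client
    if abs(time.time() - int(timestamp)) > max_skew:
        return False  # stale timestamp: possible replay
    expected = hmac.new(secret, timestamp.encode() + b"." + body,
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)
```

The obvious weakness is that the secret can be extracted from the shipped binary, so this only raises the bar (someone has to deliberately reverse-engineer the client) rather than being airtight; but it gives a hard, unambiguous yes/no instead of fuzzy heuristics.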

I'm not sure whether the context limit on the $25/mo plan or the model-size limit on the $100/mo plan would keep it from working well enough for OpenCode, but Featherless AI seems fairly unique in how they handle their inference plans.

They've said publicly that they don't want apps like OpenClaw (Hermes is a variation) being used with a monthly plan vs per-token billing. The problem is this was implemented pretty badly (regex-based detection??), and they should put a firm boundary between the two. It shouldn't be trying to switch over to a different billing plan automatically using the same API key.

I think they wanted to avoid totally locking down the monthly plan for non-agent uses, but that makes it all too fuzzy. They should use a specific mechanism like encrypted signatures, so that anything hitting the monthly plan that isn't Claude Code or the desktop Claude app just errors out, and be done with it.


Yeah, at the least it should alert the user that it is happening. Maybe the thinking was that alerting gives people a signal on how to get around the restrictions, but silently charging from a different bucket isn't the answer either.

I think part of the issue is they were letting people use the plan's API for random stuff, so people could do testing or small projects. Then the agents came along and exploded the cost, so now they want to restrict those while still allowing some other usage, which I don't think is tenable.

I'm sure there is some way they could enforce that all calls are coming from the Claude app or Claude Code. It might be hard to enforce 100%, with stuff running on a user's machine, but they could still make it quite difficult, where someone has to be intentionally trying to beat the system (like stealing encryption keys out of the Claude Code binary or something).


You could also run it on a more generic cloud inference or GPU site, at least to see how well it works for your use case before spending on hardware.

For the topic of remote control, Happy seems to be working pretty well for Claude Code but is also supposed to support Codex. It's a bit rough around the edges, but nice that it is open source: https://github.com/slopus/happy

I'm not so sure about that. Like we're using Claude Code with Bedrock and have most things on AWS with SOC2 compliance and all that. Normally switching to Codex would have a ton of friction in terms of separate contract and billing, worrying about data retention, etc.

But with Codex and GPT5.5 having generally good reviews, when it goes on Bedrock, it would be very simple for us to just try it out. Obviously there's the general friction of knowledge and comfort of Claude Code vs Codex, but that seems possible to overcome depending on the differential of the models and also of the features. Like if Codex gets remote control for Bedrock, that'd be a big advantage over Claude Code on it.


No, you definitely don't have access to the weights. The raw weights are secret enough that once a model hits a certain level of capability, their guidelines call for enough security procedures to keep even nation states from hacking in and stealing the weights.

I think cost is fairly similar. As for who wants it, enterprise stuff is a big thing, but it also seems very reliable. We've never hit rate limits or had it just be down, which it sounds like a lot of people run into with Anthropic's own servers.


Besides what the other person mentioned about being more useful for enterprise, I also heard on a podcast that gpt-image-2 uses the same general architecture as the LLM models, while Sora was a very different architecture. So by shutting down Sora they avoid maintaining two different sets of everything.
