
My recent frustration with Claude is that it feels like I'm waiting on responses more. I don't have historical latency numbers to compare against, but I feel like it has been getting slower. I may be wrong, and maybe it's just spending more time thinking than it used to. My guess is Anthropic is having capacity issues. I hope I'm wrong because I don't want to switch.


There was a really good point in this podcast episode about the speed of LLMs. They are so slow that all of the progress messages and token streaming are necessary. But the core problem is that the technology is so darn slow.

https://podcasts.apple.com/us/podcast/this-episode-is-a-cogn...

As someone who both uses and builds this technology, I think this is a core UX issue we’re going to be improving for a while. At times it really feels like a choose 2+ of: slow, bad, and expensive.


About slowdowns... I have this theory that if they sneak in some sleep(1) calls while processing medium to complex prompts, they can serve more clients.

But I think "context switching" between 2 different prompts might be too expensive for GPUs to be worth it for LLM providers. Who knows.



