Hacker News

Ollama is good enough to dabble with, and getting a model is as easy as ollama pull <model name>. Compare that to figuring it out by yourself on Hugging Face, trying to make sense of all the goofy letters and numbers across the forty different variants of a model, and on top of that you don't need a Hugging Face account to download.
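To illustrate, the whole Ollama flow is a couple of commands (the model tag here is just an example):

```shell
# fetch a model by its short name from the Ollama registry
ollama pull llama3.2

# start an interactive chat with the downloaded model
ollama run llama3.2
```

No account, no hunting through quantization suffixes; the registry picks a sensible default quant for you.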

So you start there, and eventually you want to get off the happy path. Then you need to learn more about the server, and it's all so much more complicated than just using Ollama. You just want to try models, not learn the intricacies of hosting LLMs.



To be fair, llama.cpp has gotten much easier to use lately with llama-server -hf <model name>. That said, the need to compile it yourself is still a pretty big barrier for most people.
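For reference, that flow looks something like this (the repo name is an example; -hf takes a Hugging Face <user>/<repo> for a GGUF model):

```shell
# download a GGUF straight from Hugging Face and start serving it
llama-server -hf ggml-org/gemma-3-1b-it-GGUF

# the built-in web UI and OpenAI-compatible API default to port 8080
# -> http://localhost:8080
```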


I started with Ollama and now I'm using llama.cpp/llama-server's Router Mode, which lets you manage multiple models through a single server instance.
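As I understand it, once the router is serving several models you pick one per request via the "model" field of the OpenAI-compatible endpoint. A rough sketch (port, path, and model name are examples):

```shell
# ask the server to answer with a specific loaded model
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "qwen2.5-coder-7b",
        "messages": [{"role": "user", "content": "hello"}]
      }'
```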

One thing I haven't figured out: Subjectively, it feels like ollama's model loading was nearly instant, while I feel like I'm always waiting for llama.cpp to load models, but that doesn't make sense because it's ultimately the same software. Maybe I should try ollama again to convince myself that I'm not crazy and that ollama's model loading wasn't actually instant.


You don't need to compile it yourself though? Unless you want CUDA support on Linux, I guess; dunno why you'd need such a silly thing though:

https://github.com/ggml-org/llama.cpp/releases


> dunno why you'd need such a silly thing though

I'm not sure I follow, what alternative to CUDA on Linux offers similar performance?


Ah, 'twas a mere jest, a sarcastic jab that of all the manifold builds provided, the most useful is missing - doubtless for good and practical reasons.

Nevertheless, worth looking at the Vulkan builds. They work on all GPUs!
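Roughly, on a Linux x64 box that's just download-and-run; exact asset names vary per release, so check the releases page linked above:

```shell
# grab the Vulkan build asset from the GitHub releases page, then:
unzip llama-*-bin-ubuntu-vulkan-x64.zip -d llama.cpp
cd llama.cpp

# no compiling needed; the binary runs on any Vulkan-capable GPU
./llama-server --version
```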


> That said, the need to compile it yourself is still a pretty big barrier for most people.

My distro (NixOS) has binary packages though...

And there are packages in the AUR (Arch), GURU (Gentoo), and even Debian Unstable. Now, these might be a little behind, but if you care that much you can download binaries from GitHub directly.
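For example (package names are what I believe the repos use; an AUR helper like yay is assumed for Arch):

```shell
# Arch, via an AUR helper
yay -S llama.cpp

# NixOS / nixpkgs, in a throwaway shell
nix-shell -p llama-cpp

# or skip the distro entirely and use an official prebuilt release:
# https://github.com/ggml-org/llama.cpp/releases
```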




