
1. This looks great, and I love that it supports llama.cpp. What about GPTQ, ExLlama, or ctransformers support?

2. At first glance, this looks a bit like LangFlow. I assume it's different, but how?

3. Is this freemium, or does it run fully stand-alone as OSS?



1. I would love to support additional model runners, including ExLlama and API-based models like ChatGPT. I'm less familiar with how ctransformers and GPTQ compare to llama.cpp. GPTQ used to run faster because it supported GPU acceleration, but llama.cpp now supports the GPU as well, so that may have changed. Feel free to open a GitHub issue to discuss this: https://github.com/floneum/floneum/issues/new/choose

2. There are a few differences: a) Floneum doesn't require any setup. There's no need to install Python, CUDA, or pip; just download the executable and run it. b) It has first-class support for quantized local models. c) It supports fully isolated WASM plugins (not arbitrary Python code).

3. Floneum is fully Open Source!


Thanks for your clarifications. I added it to my awesome list:

https://github.com/underlines/awesome-marketing-datascience/...


ExLlama is significantly faster if you can fit the whole model in VRAM.



