Hacker News | ComputerGuru's comments

I used Gemini exclusively via the API but downloaded the app last week for something. Even on max settings, it is ridiculously nerfed!

Unfortunately, even the API variant got RLHF'd pretty hard into being that dumb end-user assistant personality :(

But besides that, I feel like the app variant got worse the day they had that WWDC-style release event recently.

Previously it was a sparring partner that could actually keep up. But now it just doesn't.

Truly a shame. And nothing that could be fixed by local models any time soon, given that you need the size for the (cross-)domain knowledge.


It's saying it's better than naively truncating the QAT release to 4 bits.
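For anyone unsure what the "naive" baseline looks like, here is a minimal sketch, assuming it means plain symmetric round-to-nearest onto a signed 4-bit grid with one scale for the whole tensor. The function names and sample weights are made up for illustration; this is not how any particular release or quantization tool does it.

```python
def quantize_4bit(weights):
    """Symmetric round-to-nearest onto signed 4-bit codes in [-8, 7]."""
    max_abs = max(abs(w) for w in weights)
    # One scale for the whole list (assumption; real tools often use per-block scales).
    scale = max_abs / 7 if max_abs else 1.0
    codes = [max(-8, min(7, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map the 4-bit codes back to approximate float weights."""
    return [c * scale for c in codes]


weights = [0.70, -0.30, 0.10, 0.02]  # hypothetical sample values
codes, scale = quantize_4bit(weights)
restored = dequantize(codes, scale)
```

With these sample values the smallest weight rounds to code 0 and is lost entirely, which is the kind of error a smarter quantization scheme tries to reduce.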

I'm not surprised Chen's patch was rejected; that's an extremely niche use case not worth supporting. With my shell developer hat on, I agree with the closing "developers would likely welcome a native implementation that isn't (unlike the current implementation) hiding fork() and exec() under the covers".

It sounds like they're interested in the concept though, just not that specific implementation.

Yeah this seems like a promising discussion.

It has been for decades at this point. thiago's blog post, which introduced me to the topic over a decade ago (and is still one of the best explainers), points out that posix_spawn was introduced in POSIX.1-2001: https://web.archive.org/web/20120718152158/http://www.maciei...
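For readers who haven't used it, a minimal sketch of the caller's side, using Python's os.posix_spawn wrapper (available on POSIX systems since Python 3.8). The helper name spawn_and_wait is made up for illustration, and whether the C library implements the call with fork/exec internally depends on the platform.

```python
import os
import sys


def spawn_and_wait(argv):
    """Start the program at argv[0] with posix_spawn and return its exit code."""
    # The caller makes one call to create the child, instead of writing
    # fork() followed by exec() itself.
    pid = os.posix_spawn(argv[0], argv, os.environ)
    _, status = os.waitpid(pid, 0)
    # waitstatus_to_exitcode requires Python 3.9 or newer.
    return os.waitstatus_to_exitcode(status)


# Run a child interpreter that exits with status 3.
exit_code = spawn_and_wait([sys.executable, "-c", "raise SystemExit(3)"])
```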

Yeah fair enough.

How were you getting anything useful out of that? We found the (unquantized!) E2B model to be completely useless at even the simplest real-world classification tasks.

So what we want now is unsloth (or anyone) to release 4/6-bit quantized models of these releases?

Yep, Unsloth already did, as linked in the comment at the top of this thread

Why is this written as a marketing spiel instead of just, you know, answering the question?

ADBC: https://arrow.apache.org/docs/format/ADBC.html

Seems like a columnar version of ODBC, for OLAP instead of OLTP.


Quite aside from the architectural changes, I suppose this is the answer to why Google had such a glaring hole in the (pretrained) Gemma4 model lineup between the Gemma4 4b and Gemma4 26b models!

A model that comfortably fits in 16GB of VRAM (allowing room for context) is a welcome upgrade.


I’m not sure I agree? For each of your examples there are algorithmic approaches and neural network approaches. Companies have certainly been loose and wild with how they market these, but there remain distinct approaches and implementations for each. Very generally speaking, the neural network based approaches (aka “generative AI”) perform better but with much worse degenerative cases and a higher baseline rate of unwanted side effects (that are normally not immediately visible but tend to cause issues down the line).

My bigger concern is that these neural network based solutions have replaced the algorithmic ones rather than supplemented them. Many tools no longer provide the algorithmic/kernel-based approach at all, and have marketed the “AI” (née ML) alternative as a strict superset/upgrade, despite its potential drawbacks.

(Interestingly, while the inference-based implementations generally have higher latency (or infinitely worse, cloud and pay-as-you-go requirements), for some computationally difficult kernels the inference-based approach is actually faster!)


Microsoft has been releasing LLMs for years.

Sort of. Phi models were just trained on GPT outputs though.

For those who don't know: Phi was announced with a paper called "Textbooks Are All You Need". What they did was use GPT-3.5 to create synthetic textbook chapters and exercises.

They also did some more interesting work like showing very small models can be coherent as long as you have very simple children's book style training data (TinyStories is pretty famous).

Lots of these ideas are still used. "Learning facts at scale with active reading" is an ICLR 2026 paper from Meta AI that does a lot of similar work.


By design. The whole point of Phi is the "Textbooks Are All You Need" theory of curated training data, as opposed to kitchen-sink datasets.

They were mostly distilled or fine-tuned OAI models.

huh? The granite series isn't distilled

Granite is IBM

Ah snap. You’re right of course

And occasionally un-releasing them like with WizardLM.
