ComputerGuru · 2026-06-08T20:10:17 1780949417

I used Gemini exclusively via the API but downloaded the app last week for something. Even on max settings, it is ridiculously nerfed!

hypfer · 2026-06-08T21:41:51 1780954911

Unfortunately, even the API variant got RLHF'd pretty hard into being that dumb end-user assistant personality :(

But besides that, I feel like the app variant got worse the day they had that WWDC-style release thing recently.

Previously it was a sparring partner that could actually keep up. But now it just doesn't.

Truly a shame. And nothing that could be fixed by local models any time soon, given that you need the size for the (cross-)domain knowledge.

ComputerGuru · 2026-06-06T15:09:28 1780758568

It's saying it's better than naively truncating the QAT release to 4 bits.

ComputerGuru · 2026-06-06T15:06:54 1780758414

I'm not surprised Chen's patch was rejected; that's an extremely niche use case not worth supporting. With my shell developer hat on, I agree with the closing "developers would likely welcome a native implementation that isn't (unlike the current implementation) hiding fork() and exec() under the covers".

smj-edison · 2026-06-06T15:19:44 1780759184

It sounds like they're interested in the concept though, just not that specific implementation.

sanderjd · 2026-06-06T15:28:29 1780759709

Yeah this seems like a promising discussion.

Chu4eeno · 2026-06-07T00:23:24 1780791804

It has been for decades at this point. thiago's blog post, which introduced me to the topic over a decade ago (and is still one of the best explainers), points out that posix_spawn was introduced in POSIX.1-2001: https://web.archive.org/web/20120718152158/http://www.maciei...
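For anyone who hasn't used it, here is a minimal sketch of what calling posix_spawn looks like from Python on a POSIX system, via os.posix_spawn. The helper name spawn_and_wait is made up for illustration, and this says nothing about how any particular libc or kernel implements the call underneath.

```python
import os
import sys


def spawn_and_wait(argv):
    """Start argv[0] with posix_spawn and return the child's exit code.

    Hypothetical helper for illustration only. The caller never writes
    fork() or exec() itself; whatever the C library does internally to
    create the process is hidden behind this one call.
    """
    # argv[0] must be a path to an executable; posix_spawn does not search PATH.
    pid = os.posix_spawn(argv[0], argv, os.environ)
    # Reap the child and convert the raw wait status to an exit code.
    _, status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(status)


if __name__ == "__main__":
    # Run a child Python interpreter that exits with a known code.
    code = spawn_and_wait([sys.executable, "-c", "raise SystemExit(7)"])
    print("child exited with", code)
```

This needs Python 3.9 or later for os.waitstatus_to_exitcode, and os.posix_spawn is only available on Unix-like platforms.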

sanderjd · 2026-06-07T00:31:32 1780792292

Yeah fair enough.

ComputerGuru · 2026-06-05T23:30:27 1780702227

How were you getting anything useful out of that? We found the (unquantized!) E2B model to be completely useless at even the simplest real-world classification tasks.

ComputerGuru · 2026-06-05T23:26:41 1780702001

So what we want now is unsloth (or anyone) to release 4/6-bit quantized models of these releases?

coder543 · 2026-06-05T23:28:23 1780702103

Yep, Unsloth already did, as linked in the comment at the top of this thread

ComputerGuru · 2026-06-05T14:58:31 1780671511

Why is this written as a marketing spiel instead of just, you know, answering the question?

ComputerGuru · 2026-06-05T14:55:51 1780671351

ADBC: https://arrow.apache.org/docs/format/ADBC.html

Seems like a columnar version of ODBC, for OLAP instead of OLTP.

ComputerGuru · 2026-06-03T17:08:35 1780506515

Quite aside from the architectural changes, I suppose this is the answer to why Google had such a glaring hole in the (pretrained) Gemma4 model lineup between the Gemma4 4b and Gemma4 26b models!

A model that comfortably fits in 16GB of VRAM (allowing room for context) is a welcome upgrade.

ComputerGuru · 2026-06-03T17:03:26 1780506206

I’m not sure I agree? For each of your examples there are algorithmic approaches and neural network approaches. Companies have certainly been loose and wild with how they market these, but there remain distinct approaches and implementations for each. Very generally speaking, the neural-network-based approaches (aka “generative AI”) perform better but with much worse degenerate cases and a higher baseline rate of unwanted side effects (that are normally not immediately visible but tend to cause issues down the line).

My bigger concern is that these neural-network-based solutions have replaced the algorithmic approaches rather than supplementing them. Many tools no longer provide the algorithmic/kernel-based approach at all, and have marketed the “AI” (née ML) alternative as a strict superset/upgrade, despite its potential drawbacks.

(Interestingly, while the inference-based implementations generally have higher latency (or infinitely worse, cloud and pay-as-you-go requirements), for some computationally difficult kernels the inference-based approach is actually faster!)

ComputerGuru · 2026-06-02T19:30:43 1780428643

Microsoft has been releasing LLMs for years.

ipsum2 · 2026-06-02T19:32:12 1780428732

Sort of. Phi models were just trained on GPT outputs though.

kingstnap · 2026-06-02T20:13:41 1780431221

For those who don't know about this: Phi was announced with a paper called "Textbooks Are All You Need". What they did was use GPT-3.5 to create synthetic textbook chapters and exercises.

They also did some more interesting work like showing very small models can be coherent as long as you have very simple children's book style training data (TinyStories is pretty famous).

Lots of these ideas are still used. "Learning Facts at Scale with Active Reading" is an ICLR 2026 paper from Meta AI that does a lot of similar work.

not_a_bot_4sho · 2026-06-02T20:11:06 1780431066

By design. The whole point of Phi is the "textbooks are all you need" theory on curated training data, as opposed to kitchen sinks.

lemonish97 · 2026-06-02T19:45:33 1780429533

They were mostly distilled or fine-tuned OAI models.

Havoc · 2026-06-02T20:04:46 1780430686

huh? The granite series isn't distilled

wirybeige · 2026-06-02T20:15:17 1780431317

Granite is IBM

Havoc · 2026-06-03T08:45:40 1780476340

Ah snap. You’re right of course

jwitthuhn · 2026-06-02T19:34:43 1780428883

And occasionally un-releasing them like with WizardLM.