
Qwen 3.5 27B is dense, so (I think) it should be compared to Gemma 4 31B.

Or Gemma 4 26B(-A4B) should be compared to Qwen 3.5 35B(-A3B).



Exactly: compare MoE with MoE and dense with dense; otherwise it's apples and oranges.


It's coding to coding. I couldn't care less how the model is architected; I only care how it performs in a real-world scenario.


If you don't care how it's architected, why do you care about size? Compare it to Q3.5 397B-A17B.

Just as smaller models are a speed/cost optimization, so is MoE.

G4 26B-A4B goes 150 t/s on 4090/5090, 80 t/s on M5 Max. Q3.5 35B-A3B is comparably fast. They are flash-lite/nano class models.

G4 31B, despite a small increase in total parameter count, is over 5 times slower. Q3.5 27B is comparably slow. They approximate flash/mini class models (I believe the sizes of proprietary models in this class are closer to Q3.5 122B-A10B or Llama 4 Scout 109B-A17B).


The implication is that there is (or should be) a major speed difference: naively you'd expect the MoE to be roughly 10x faster and cheaper, which can be pretty relevant on real-world tasks.
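
For what it's worth, the "naive" estimate is just the ratio of active parameters per token. A rough sketch, assuming decode cost scales linearly with active parameters and reading the "-A3B" / "-A4B" suffixes as roughly 3B / 4B active (`naive_speedup` is a made-up helper, not from any library):

```python
# Hypothetical helper: naive speedup of an MoE model over a dense one.
# Assumption: per-token cost is proportional to active parameters only,
# ignoring routing overhead, attention cost, and memory limits.
def naive_speedup(dense_active_b: float, moe_active_b: float) -> float:
    return dense_active_b / moe_active_b


# (dense model, params in billions, MoE model, assumed active params in billions)
pairs = [
    ("Qwen 3.5 27B", 27.0, "Qwen 3.5 35B-A3B", 3.0),
    ("Gemma 4 31B", 31.0, "Gemma 4 26B-A4B", 4.0),
]

for dense_name, dense_b, moe_name, moe_b in pairs:
    ratio = naive_speedup(dense_b, moe_b)
    print(f"{moe_name} vs {dense_name}: ~{ratio:.2f}x (naive)")
```

That works out to about 9x for the Qwen pair and a bit under 8x for the Gemma pair, so "10x" is a round-up. The "over 5 times" gap reported upthread for Gemma is smaller than the naive figure, which is what you'd expect once real overheads come into play.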



