It’s probably both. If 100x is the goal, they’ll have to double efficiency about 7 times (2^7 = 128), which seems basically plausible given how early-days it still is (I mean, they have been training on GPUs this whole time, not ASICs… Bitcoin mining hardware is more developed than that, and Bitcoin is a dumb scam machine). Probably some of the doublings will come from software and some from hardware.
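For anyone who wants to check the "7 doublings" figure, here is a minimal sketch. The function name `doublings_needed` is just made up for illustration, and it assumes each step is a clean 2x gain, which real hardware and software improvements won't be.

```python
import math

def doublings_needed(target_factor):
    """Smallest whole number of 2x steps whose product reaches target_factor."""
    return math.ceil(math.log2(target_factor))

n = doublings_needed(100)
# Seven doublings overshoot the 100x goal slightly: 2**7 = 128.
print(n, 2 ** n)
```

Six doublings only get you to 64x, so seven is the first whole number that clears 100x.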
I'm pretty skeptical of the scaling hypothesis, but I also think there is a huge amount of efficiency improvement runway left to go.
I think it's more likely that the returns to further scaling will turn net negative at some point, and then the efficiency gains will no longer go into doing more with more but into doing the same amount with less.
But it's definitely an unknown at this point, from my perspective. I may be very wrong about that.
> It’s not at all, energy is a hard constraint to capability.
We can put a lot more power flux through an AI than a human body can live through; both because computers can run hot enough to cook us, and because they can be physically distributed in ways that we can't survive.
That doesn't mean there's no constraint; it's just that, to the extent there is one, it sits way, way above what humans can consume directly.
Also, electricity is much cheaper than humans. To give a worked example: the UN poverty threshold* is about US$2.15/day in 2022 money, or just under 9¢/hour. My first Google search result for "average cost of electricity in the usa" says "16.54 cents per kWh", so a human living at the UN poverty threshold costs the same as running just under 542 watts of average-priced American electricity around the clock.
The actual power consumption of a human is 2000-2500 kcal/day ~= 96.85-121.1 watts ~= about a fifth of that. In certain narrow domains, AI already makes human labour uneconomic… though fortunately for the ongoing payment of bills, it's currently only that combination of good-and-cheap in narrow domains, not generally.
* I use this standard so nobody suggests outsourcing somewhere cheaper.
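The arithmetic above can be reproduced with a short sketch. The only inputs are the figures already quoted (US$2.15/day, 16.54 cents per kWh, 2000-2500 kcal/day) plus the standard conversion of 4184 joules per kcal. The function and variable names are my own, purely for illustration.

```python
# Figures quoted in the comment above; nothing here is new data.
POVERTY_USD_PER_DAY = 2.15
ELECTRICITY_USD_PER_KWH = 0.1654
JOULES_PER_KCAL = 4184      # thermochemical kilocalorie
SECONDS_PER_DAY = 86_400

def budget_as_watts(usd_per_day, usd_per_kwh):
    """Continuous electrical power (W) that a daily budget would buy."""
    usd_per_hour = usd_per_day / 24
    kilowatts = usd_per_hour / usd_per_kwh   # kWh bought per hour == kW
    return kilowatts * 1000

def metabolic_watts(kcal_per_day):
    """Average power (W) of a body burning kcal_per_day."""
    return kcal_per_day * JOULES_PER_KCAL / SECONDS_PER_DAY

budget_w = budget_as_watts(POVERTY_USD_PER_DAY, ELECTRICITY_USD_PER_KWH)
low_w = metabolic_watts(2000)
high_w = metabolic_watts(2500)

print(f"poverty-line budget as electricity: {budget_w:.1f} W")
print(f"human metabolism: {low_w:.2f} to {high_w:.2f} W")
print(f"ratio: {low_w / budget_w:.2f} to {high_w / budget_w:.2f}")
```

The ratio comes out between roughly 0.18 and 0.22, which is where the "about a fifth" comes from.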
Honestly I think the opposite. All these giant tech companies can afford to burn money with ever bigger models and ever more compute and I think that is actually getting in their way.
I wager that some scrappy, resource-constrained startup or research institute will find a way to produce results similar to those of these ever more massive LLM projects at a fraction of the cost. And I think they’ll do that by pruning the shit out of the model. You don’t need to waste model space on ancient Roman history or the entire canon of the Marvel Cinematic Universe in a model designed to refactor code. You need a model that is fluent in English and “code”.
I think the future will be tightly focused models that can run on inexpensive hardware. And unlike today, where only the richest companies on the planet can afford training, anybody with enough inclination will be able to train them. (And you could go on a huge tangent about why such a thing is absolutely crucial to a free society.)
I dunno. My point is, there is little incentive for these huge companies to “think small”. They have virtually unlimited budgets, so they all operate under the idea that more is better. That isn’t gonna be “the answer”… they are all gonna get blindsided by some group that does more with significantly less. These small scrappy models, and the institutes and companies behind them, will eventually replace the old guard. It’s a tale as old as time.
DeepSeek just released their frontier model, which they trained on 2k GPUs for <$6M. Way cheaper than a lot of the big labs. If the big labs can replicate some of their optimisations, we might see some big gains. And I would hope more small labs could then shrink the footprint and costs even further.
I don’t think this stuff will be truly revolutionary until I can train it at home, or perhaps as a group (SETI@home, anybody?).
Six million is a start but this tech won’t truly be democratized until it costs $1000.
Obviously I’m being a little cheeky, but my real point is… the idea that this technology is in the control of massive technology companies is dystopian as fuck. Where is the RMS of the LLM space? Who is shouting from every rooftop about how dangerous it is to grant so much power and control over information to a handful of massive tech companies, all of whom have long histories of caving in to various government demands? It’s scary as fuck.