
gdiamos · 2026-06-15T07:41:35 1781509295

This is why I use a router to send my own IP to my own models, and general information to Claude.

https://split-brain-ui.scalarxlm.com/docs/clients

I expect Claude to train on my general tokens. I train my own model on my IP related tokens.
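A minimal sketch of what such a router could look like, assuming a simple keyword check decides what counts as IP. The marker list, the function names, and both backends are hypothetical stand-ins and are not taken from the linked docs; a real router would probably use a classifier and real model clients.

```python
from typing import Callable, Dict, Iterable

# Hypothetical markers for proprietary content (assumption, not from the linked docs).
IP_MARKERS = ("confidential", "internal", "proprietary")


def is_ip_related(prompt: str, markers: Iterable[str] = IP_MARKERS) -> bool:
    """Return True if the prompt mentions any of the IP markers."""
    text = prompt.lower()
    return any(marker in text for marker in markers)


def route(prompt: str, backends: Dict[str, Callable[[str], str]]) -> str:
    """Send IP-related prompts to the local backend, everything else to the hosted one."""
    name = "local" if is_ip_related(prompt) else "hosted"
    return backends[name](prompt)


# Stub backends standing in for a self-hosted model and a hosted API.
def local_model(prompt: str) -> str:
    return f"[local] {prompt}"


def hosted_model(prompt: str) -> str:
    return f"[hosted] {prompt}"


BACKENDS = {"local": local_model, "hosted": hosted_model}
```

With this sketch, `route("Summarize our internal design doc", BACKENDS)` goes to the local stub, and a general question goes to the hosted stub.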

gdiamos · 2026-06-15T07:35:04 1781508904

This weekend I was reading this paper on programming the Cerebras wafer scale engine: https://arxiv.org/html/2405.07898v1. Data movement is the expensive part of computing, and some algorithms, like stencils, only require nearest-neighbor data movement per cycle. Cerebras wafers have very low-energy transfers between neighboring processing elements on the same wafer, so they came up with a language called Tungsten that focuses on this exchange primitive in the kernel programming model.

I thought the challenge of programming hundreds of thousands of cores using a mesh would be interesting, so I wrote a simulator, a simple compiler, and a few simple kernels for the wafer scale engine using publicly available documents.

I'm used to CUDA, so I asked: "How would you map something like CUDA onto a machine like this?" Well, I use something like malloc to allocate global memory, memcpy to move data between host and device memory, and a queue of thread block launches. This time, though, threads within the same block communicate using nearest-neighbor send/recv instructions instead of through shared memory on a streaming multiprocessor. This is inspired by the stencils in Tungsten.

The whole program is made up of a bulk synchronous kernel of many thread blocks.
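A rough sketch of the bulk synchronous, nearest-neighbor idea in plain Python. This is not Tungsten, not the Cerebras ISA, and not the simulator described here; the `PE` class, the inbox, and the averaging stencil are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

Coord = Tuple[int, int]
# North, south, west, east neighbors on a 2D mesh.
NEIGHBOR_OFFSETS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


class PE:
    """A hypothetical processing element holding one value and an inbox."""

    def __init__(self, value: float) -> None:
        self.value = value
        self.inbox: List[float] = []


def make_mesh(rows: int, cols: int, init: Callable[[int, int], float]) -> Dict[Coord, PE]:
    """Build a rows x cols mesh, initializing each PE from init(row, col)."""
    return {(r, c): PE(init(r, c)) for r in range(rows) for c in range(cols)}


def superstep(mesh: Dict[Coord, PE]) -> None:
    """One bulk synchronous step: every PE sends, then every PE computes."""
    # Send phase: each PE sends its current value to its existing neighbors.
    for (r, c), pe in mesh.items():
        for dr, dc in NEIGHBOR_OFFSETS:
            neighbor = mesh.get((r + dr, c + dc))
            if neighbor is not None:
                neighbor.inbox.append(pe.value)
    # Implicit barrier: all sends have finished before any PE computes.
    # Compute phase: average the PE's own value with what it received.
    for pe in mesh.values():
        pe.value = (pe.value + sum(pe.inbox)) / (1 + len(pe.inbox))
        pe.inbox.clear()


def run_kernel(mesh: Dict[Coord, PE], steps: int) -> None:
    """Run the stencil for a fixed number of supersteps."""
    for _ in range(steps):
        superstep(mesh)
```

For example, `make_mesh(3, 3, lambda r, c: 9.0 if (r, c) == (1, 1) else 0.0)` followed by `run_kernel(mesh, 1)` spreads the center value to its four neighbors. Separating the send and compute phases is what keeps each step synchronous: no PE reads a neighbor's partially updated value.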

I think it is interesting because CUDA has some hard limits on thread block sizes, but this mesh perspective lets you grow or shrink the blocks significantly.

Note that some information about Cerebras wafer scale engines, like the ISA, is not public (as far as I know). In this code, I just guessed what it could be.

So this should not be taken as a faithful or accurate simulation of the wafer scale engine. It is more like a point in the design space that is similar in that it includes a wafer-sized mesh of processing elements.

gdiamos · 2026-06-10T12:24:54 1781094294

What I do is route general data to Mythos, and my own IP to a local model.

I expect them to train on their traffic, and I train on mine.

gdiamos · 2026-06-07T13:18:46 1780838326

What I tell my team to do is stop using so many cloud SaaS apps and build more themselves using LLMs.

I’m not planning on firing people, but I am planning on building more, using more tokens, and paying for fewer app subscriptions.

One aspect of building that doesn’t erode is human values.

LLMs don’t create software with zero direction, and although I do have 12 agents building constantly, I don’t have the attention to increase that to 100.

zaphirplane · 2026-06-07T13:33:11 1780839191

How strange, or at least unintuitive. Buying should be cheaper than creating for a customer of 1.

gdiamos · 2026-06-07T13:42:45 1780839765

Think about the worst enterprise SaaS apps you have used…

zaphirplane · 2026-06-08T11:12:51 1780917171

A rewrite of SAS, Salesforce, or SAP will never have the breadth and business know-how.

dominotw · 2026-06-07T13:21:06 1780838466

You don't need to vibe code shitty apps. You just need to learn how to use apps like Codex and Claude Desktop.

gdiamos · 2026-06-07T13:44:46 1780839886

I don’t get it. That’s what I am using.

gdiamos · 2026-06-03T18:08:40 1780510120

There is demand for US open models.

literalAardvark · 2026-06-03T19:10:06 1780513806

I sincerely wonder why. Chinese censorship is only really relevant if you're doing anti-China stuff, which is to say never, while the Western kind of model censorship (a combination of copyright and general fairness) is something everyone's had to work around at least once, even if just for writing an interesting story.

gdiamos · 2026-06-03T22:22:43 1780525363

It’s about enterprises who care about supply chain risk and having a throat to choke if they have a problem.

Here’s a real example.

I’m in a design meeting talking about a model use case. We have a question about the data pipeline or the prompt format that would benefit from knowing about how the model was trained. The enterprise team lead calls the dev tech engineer from the company who produced the model. He is already in the office and walks into the meeting to answer the question.

gdiamos · 2026-05-27T19:51:36 1779911496

Instead of moving to DuckDuckGo, I just stopped using search.

gdiamos · 2026-05-24T07:31:22 1779607882

How far can a pure mercenary culture get?

m0llusk · 2026-05-24T08:41:33 1779612093

All the way to the end

gdiamos · 2026-05-21T00:56:43 1779325003

Don’t put it past Dario to buy SpaceX

redox99 · 2026-05-21T01:13:50 1779326030

Elon will never sell SpaceX. And he controls 86% of votes.

gdiamos · 2026-05-19T22:48:34 1779230914

I’m seeing founders being encouraged to run their business with AI and cut out the etc etc

dalyons · 2026-05-20T03:07:51 1779246471

Sure, that’s capital’s dream, but how does that actually work out in practice?

gdiamos · 2026-05-19T17:02:44 1779210164

Smart move