singularity2001 · 2026-06-10T10:55:12 1781088912

Most importantly, the reinforcement loop is used during training. I don't agree with Sutton's original hypothesis, but it holds even less after reinforcement learning.

porridgeraisin · 2026-06-10T11:52:46 1781092366

RLVR still does not expand beyond the base distribution, though; it only mode-seeks within it.

I.e., evaluation and retention yes; variation or "planning" no.

That is not to say you cannot use LLMs. AlphaEvolve does exactly that: it uses a simple external evolutionary planner. The overarching point he's making is that our planner is still "dumb" and we need to work on it.

When you iteratively guide an LLM in Claude Code, you are the external planner. That also works.
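To make the "external planner" idea concrete, here is a toy sketch of an evolutionary loop wrapped around a proposer. It is not AlphaEvolve's actual implementation: `propose` is a hypothetical stand-in for an LLM call (here it just mutates one character), and `score`, `evolve`, and the string-matching task are all made up for illustration. The point is only that variation, evaluation, and retention live in the outer loop, not in the proposer.

```python
import random
import string

ALPHABET = string.ascii_lowercase


def propose(parent: str, rng: random.Random) -> str:
    """Hypothetical stand-in for an LLM call: change one character of the parent."""
    i = rng.randrange(len(parent))
    return parent[:i] + rng.choice(ALPHABET) + parent[i + 1:]


def score(candidate: str, target: str) -> int:
    """External evaluator: number of positions that already match the target."""
    return sum(a == b for a, b in zip(candidate, target))


def evolve(target: str, population_size: int = 20, generations: int = 500, seed: int = 0) -> str:
    """Simple external planner: vary, evaluate, keep the best candidates."""
    rng = random.Random(seed)

    def rank(pop):
        return sorted(pop, key=lambda c: score(c, target), reverse=True)

    population = rank(
        ["".join(rng.choice(ALPHABET) for _ in target) for _ in range(population_size)]
    )
    for _ in range(generations):
        if population[0] == target:
            break
        # Variation: ask the proposer for children of randomly chosen parents.
        children = [propose(rng.choice(population), rng) for _ in range(population_size)]
        # Evaluation + retention: parents compete with children, best survive.
        population = rank(population + children)[:population_size]
    return population[0]


if __name__ == "__main__":
    best = evolve("planner")
    print(best, score(best, "planner"))
```

Because parents compete with their children, the best score never goes down between generations. Swapping `propose` for a real model call would leave the loop itself unchanged.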

singularity2001 · 2026-06-08T18:29:06 1780943346

Or could it be the other way around: actual privacy is forbidden in Europe because they want to read your messages?

peterspath · 2026-06-08T18:34:27 1780943667

This. They come up with so many laws, under so many pretences, that would take away the freedom of private communication.

cromka · 2026-06-08T18:30:31 1780943431

It's not about how it is but how they made it sound. Let's not get ideological here.

singularity2001 · 2026-06-08T18:25:51 1780943151

Do they not have data centers in Europe yet?

robot_jesus · 2026-06-08T18:42:20 1780944140

They have some for sure for iCloud. Do they have enough to handle this volume of compute AND is Gemini allowed to be run on those? That was more what I was questioning/curious about.

singularity2001 · 2026-06-05T15:18:00 1780672680

Some of the greatest ideas are first proposed in a ridiculed manner.

singularity2001 · 2026-05-15T12:11:28 1778847088

"allows the important stuff like LLMs writing code so long as you disclose."

Are you sure? It says:

"It's fine to use LLMs to answer questions, analyze, distill, refine, check, suggest, review. But not to *create*."

staticassertion · 2026-05-15T15:30:53 1778859053

Yes. The policy is pretty clear on what the rules are for LLM generated code. You need a reviewer to agree to review LLM generated code, you need to read the code yourself, etc.

phire · 2026-05-16T10:18:20 1778926700

It's a pretty strict ban, with an exception.

That exception is experimental and somewhat limited; it only allows "well tested, high-quality" PRs on parts of the codebase that have a low probability of causing soundness issues, and it has a separate review process with much higher standards.

And it requires the reviewer to agree to the use of LLMs ahead of time, before the PR is opened.

IMO, it has a high likelihood of degrading to a closed system, where some programmers with a good track record have little issue merging LLM-generated PRs, while anyone without a reputation will struggle to even open an AI-assisted PR.

singularity2001 · 2026-05-05T07:37:05 1777966625

Interesting. When I read the book, I wanted to rename it to "How to win fake friends and manipulate people." Maybe I missed the humble passages.

singularity2001 · 2026-04-28T13:29:28 1777382968

Formal proofs are made to be done by AI.

If a green checkmark goes away, so be it. AI may or may not understand how to fix it, but it's no burden to the user / developer.

singularity2001 · 2026-04-28T13:25:13 1777382713

Flagged for not defining what RF engineering is

singularity2001 · 2026-04-24T13:22:54 1777036974

The US has massacred millions of people of other countries, is that better?

code_for_monkey · 2026-04-24T13:45:38 1777038338

You don't even have to look abroad; the USA kills its own citizens all the time. Police brutality is a huge issue here. We had some large protests, and the country ended those with the realization that nothing can be done about it. Kids get shot in school all the time in the US and, once again, nothing ever gets done about it. The USA has a gigantic prison population and, you guessed it, nothing gets done about it.

singularity2001 · 2026-04-22T08:44:48 1776847488

I thought Cursor became mostly obsolete with Claude Code and Codex TUIs?

jjav · 2026-04-23T07:11:46 1776928306

> I thought cursor became mostly obsolete with Claude Code and Codex TUIs?

I wouldn't think so. At work I have both Cursor and Claude Code, and while I use both, Cursor is by far the more pleasant to use. If I had to give one up, I'd let Claude go.

user34283 · 2026-04-22T10:31:55 1776853915

Are TUIs not yesterday’s hot thing?

The way I work now in the Codex desktop app is that I spin up 3-5 conversations which work in their dedicated git worktree.

So while the agent works and runs the test suite I can come back to other conversations to address blockers or do verification.

What matters is that I can see which conversation has an update and that I get desktop notifications.

Maybe I could set this up with tabs in the Terminal, but it does not sound like the best UX.
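For anyone wanting to try the worktree-per-conversation setup by hand, a rough sketch follows. It only builds the `git worktree add -b <branch> <path>` commands, one per task, and does not run them. The `agent/` branch prefix, the sibling-directory layout, and the task names are assumptions for illustration, not anything the Codex app is known to do.

```python
from pathlib import Path


def worktree_commands(repo: Path, tasks: list[str]) -> list[list[str]]:
    """Build one `git worktree add` command per task (hypothetical naming scheme)."""
    commands = []
    for task in tasks:
        branch = f"agent/{task}"                         # assumed branch prefix
        checkout = repo.parent / f"{repo.name}-{task}"   # assumed sibling directory
        commands.append(
            ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(checkout)]
        )
    return commands


if __name__ == "__main__":
    for cmd in worktree_commands(Path("/tmp/app"), ["fix-login", "add-tests"]):
        print(" ".join(cmd))
```

Each command could be passed to `subprocess.run(cmd, check=True)`; every agent then works in its own checkout and branch, so parallel conversations do not touch each other's files.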

unknownx113 · 2026-04-24T13:31:33 1777037493

That's probably more a personal preference than an objective measurement. A lot of people already spend most of their dev time in the terminal, so for someone like me who uses Neovim, Claude Code or Codex CLI is much easier than the GUIs.

dmix · 2026-04-22T13:28:32 1776864512

The solution is to use both. They both have their use cases. Cursor has autocomplete and lets you quickly highlight a few lines -> throw into context, plus it's got a very good file index/API (which burns far fewer tokens than Claude's grep'ing) and whatever else they are doing underneath to optimize it for coding.

Claude is still the gold standard if you're not in an IDE though.

kid64 · 2026-04-22T17:53:20 1776880400

Grep'ing doesn't use tokens, it uses grep.

dmix · 2026-04-22T23:08:37 1776899317

Reading files is always the biggest token burner when coding. If it can't find stuff quickly, or has to use less and head to trim the output before finding it, then you're just wasting context window.

Cursor both lets you highlight specific lines multiple times per chat and is much quicker at finding stuff.

jmalicki · 2026-04-22T21:33:05 1776893585

Claude has to use more tokens to read the grep output.
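A back-of-the-envelope sketch of the point being argued here: running grep costs nothing in tokens, but whatever it prints has to be read into the context. The 4-characters-per-token figure is a rough rule of thumb and an assumption (real tokenizers vary), and `estimate_tokens`, `grep_lines`, and the generated sample file are made up for illustration.

```python
import re

CHARS_PER_TOKEN = 4  # rough heuristic, an assumption; real tokenizers differ


def estimate_tokens(text: str) -> int:
    """Very rough token estimate: character count divided by 4, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def grep_lines(text: str, pattern: str) -> str:
    """Mimic `grep -n`: return matching lines prefixed with their line number."""
    rx = re.compile(pattern)
    return "\n".join(
        f"{n}:{line}"
        for n, line in enumerate(text.splitlines(), start=1)
        if rx.search(line)
    )


# A made-up 200-line source file.
source = "\n".join(f"def handler_{i}(request): return respond({i})" for i in range(200))
hits = grep_lines(source, r"handler_1\d\b")  # matches handler_10 .. handler_19

if __name__ == "__main__":
    print("whole file :", estimate_tokens(source), "tokens (estimated)")
    print("grep output:", estimate_tokens(hits), "tokens (estimated)")
```

A tight pattern keeps the output small, while a broad one can cost nearly as much as reading the file, which is where an index that returns only the relevant spans would help.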

freedomben · 2026-04-22T09:45:06 1776851106

That matches my anecdatal experience with a couple dozen devs. Many went hard on the Cursor train and have mostly gotten off now with CC and Codex TUIs available.