Hacker Newsnew | past | comments | ask | show | jobs | submit | blcknight's commentslogin

One bad npm package can really ruin your day. These things for me only run in their own VM with it's own GitHub account and basically nothing else

People probably think you’re being ridiculous but Shai Hulud had its very first attempt at manipulating AI lead analysis and I know of at least one company where that resulted in them getting pwned.

This is only going to become more of a problem in the future and people need to educate themselves on the technical barriers to use because guardrails only sometimes work.


The fallback doesn't seem to be working for me, I haven't scanned a project in it immediately booted me when it found a security bug even though I didn't ask for it

Kube play and quadlets are cool

It hasn't changed, and I don't know why people are saying that most books don't have DRM. It is only a small minority.

Tor books is the largest publisher without it (owned by Macmillan). Otherwise everything is truly hard DRM either ACSM with epub or Kindle's. They are both more or less easily defeated though.


Kobo does carry DRM-free books, and I've never encountered DRM on any books I bought directly from authors' or publishers' websites either.


The vast majority of kobo books are still ACSM protected.


Because these are people that don’t buy the books anyway. They pirate them and put them onto their 2010 Chinese ebook reader.


All kinds of tools make it really difficult to not make a URL clickable and even if it wasn't clickable they might still put it in address bar...


Well, they might also just not read the page the OP made, but a change doesn't have to be perfect to be an improvement.


I am not sure anyone knows what a harness is at this point. I've heard 17 different definitions of it at this point. It's almost like a buzzword in search of a problem.


Author here. My definition is: you take an agent, remove the model and you’re left with the harness.

Tools, memories, sandboxing, steering, etc


Clean definition, stealing it. Way better than mine: "Now imagine Claude as Shinji and Claude Code as Eva..."


Huh. My definition - or rather, explanation - has always been, "The model is just a big bag of floats you multiply with some numbers to get some numbers out, plus a regular program that runs a loop which, at minimum, turns inputs (text, images) into a stream of numbers, pushes it through those multiplications against the bag of floats, and turns results back into text/images/whatnot. That regular program is called a harness[0]. Now, the trick to make LLMs into agents, is to add another loop in the harness that reads the output and decides whether to send it out to user, or do something else, like executing more code (that's what tools are), or feeding it back to input with some commentary (that's how you get "thinking"), or both (that's how you get the "agentic loop")".

Because there isn't really much more to it. And ever since we, i.e. those of us who played with ChatGPT API early on, bolted tools to it, some half a year before OpenAI woke up and officially named it "function calling" - ever since then, we knew that harness was the key. What kept changing was which logic (and how much of it) to put in explicitly, vs. pushing it back to the model on the "main thread", vs. pushing it to a model on a separate conversation track. But the basic insight remains the same.

--

[0] - Well, today - until recently you'd call it a "runner" or "runtime".


So, client?


But what is an agent without tools?


Code.


Like as in what its made out of, or what it makes? Neither really makes sense here? Lots of things are made out of code and not necessarily agents, but also (from my decidedly outside observer perspective) "agents" are not limited to being code producers either.


If you use cloud models.. the harness is what runs in your computer

AI companies would love if everything ran in their cloud, but arguably there are latency reasons or other reasons to run at least some stuff in your own computer


I don’t even know what an agent means, let alone harness.


There is an LLM API. You send it a system prompt and the conversation history. If the last message is a user message the agent will send back a response. It can also send back a “thinking” message before it sends a response and it can also send back a structured message with one or more function calls for functions you defined in your API request (things like “ls(): list files”).

The harness is the part that makes the API calls, interacts with the user, makes the function calls, and keeps track of the conversation memory.

You can also use the LLM to summarize the conversation into a single shorter message so you get compaction. And instead of statically defining which functions are available to the LLM you can create an MCP server which allows the LLM to auto-discover functions it can call and what they do.

That’s the whole magic of something like Claude Code. The rest is details.


I'd say the core is that the harness/runtime/${whatever you call it} doesn't just unconditionally sends model output to the user, and user input to the model, +/- some post-processing, but instead runs a loop that feeds the output back to the model if some conditions are met. That gives you basic "thinking" and single "function calling" a-la early ChatGPT. However, if you allow it to loop arbitrary number of times and allow the output to decide whether to loop or to stop, you get a basic agent.


Agent is currently defined as "what I want it to mean given whatever I am talking about".

Personally, for me it embodies a level of autonomy. I define that as, an AI model with potential to interact with something external to itself based on its output, where that includes its own future behavior.


the agent harness is the REPL. The evaluation + loop.


Trump just needs to mint a 39T coin with his face on it.


Fight enshittification. For whatever reason, many travel sites no longer send full details in the e-mail confirmation, they want you to click through to the site...which means I can't forward it to plans@tripit.com for automatic import.

Immediately after booking something,I tell Gemini to add it to my TripIt. Works great. I have a little prompt explaining how I like it formatted that I cut and paste, so I can just make this a one-click prompt. I could also have it add flights to my.flightradar24.com.

I also use Gemini in Chrome to add appointment confirmations to my calendar. Or remember things in Google Keep.

There's lot of use cases for this kind of thing.


Lots of people are flocking to Claude and ChatGPT -- and making Gemini more useful in the browser everyone already has makes a lot of sense.

More Google use, more data they gather, more ads they can show you


"Lots of people" -- actual numbers will be helpful.

If you look at market share, Google the search product barely changed.

In terms of financials, Alphabet is earning more than ever on ads, according to earnings.


To chatgpt? Claude I believe, but chat?


For the love of god fix bugs and write some fricken tests instead of dropping new shiny things

It is absolutely wild to me you guys broke `--continue` from `-p` TWO WEEKS AGO and it is still not fixed.


--resume works fine?


That's why I mentioned `-p`.

`--continue` and `--resume` are broken from `-p` sessions for the last 2 weeks. The use case is:

1. Do autonomous claudey thing (claude -p 'hey do this thing')

2. Do a deterministic thing

3. Reinvoke claude with `--continue`

This no longer works. I've had this workflow in GitHub actions for months and all of a sudden they broke it.

They constantly break stuff I rely on.

Skill script loading was broken for weeks a couple months ago. Hooks have been broken numerous times.

So tired of their lack of testing.


Been working fine for me with -p


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: