Hacker News | usef-'s comments

I said this in the other thread, but they were proven right about their gpt2 worries, weren't they?

From the original 2019 release:

> We can also imagine the application of these models for malicious purposes, including the following (or other applications we can’t yet anticipate):

    Generate misleading news articles
    Impersonate others online
    Automate the production of abusive or faked content to post on social media
    Automate the production of spam/phishing content
> These findings, combined with earlier results on synthetic imagery, audio, and video, imply that technologies are reducing the cost of generating fake content and waging disinformation campaigns. The public at large will need to become more skeptical of text they find online, just as the “deep fakes” phenomenon calls for more skepticism about images.

These worries are why they stated they were cautious in rolling it out

https://openai.com/index/better-language-models/


> The public at large will need to

Ah, yes. You see, it’s not them who are wrong for knowingly releasing something they knew to be harmful, it’s everyone else who needs to change. That seems reasonable. Humanity is famous for being able to rapidly adapt to fast changes as one voice. Oh, wait…

They are no different to the tobacco and oil companies. They know the harm they’re causing but care about personal profit above everything else.


I'm not an AI booster, but in this case I'd say that pausing the rollout for mitigations (such as public education) to be put in place was the responsible course of action.

With the benefit of hindsight, you can certainly argue that the pause wasn't long enough or that the mitigations weren't sufficient. But that wasn't a view held by many at the time - indeed, it was mocked as a marketing ploy (and still is; see gp's post as evidence).


> pausing the rollout for mitigations

What mitigations? Nothing they’ve done is relevant to the four points in the comment above.

> such as public education

Their “public education” is about as meaningful as alcohol warnings.

https://www.youtube.com/watch?v=Xj4aRhHJOWU

> With the benefit of hindsight

No hindsight needed. These problems were obvious from the start. Not just to me but to many others. Clearly also to them.

> indeed, it was mocked as a marketing ploy (and still is; see gp's post as evidence)

Two things can be true at once. Of course it’s marketing to say “this is too dangerous to release” if they’re going to do it anyway. Either that or they’re so supremely irresponsible and greedy that they don’t care about the consequences as long as they can profit. And again, all of those can be true at once.

Also, worth noting that when they talk about it being “too dangerous”, they’re usually talking about fantasy scenarios of the AI gaining sentience and enslaving humanity. But there are many other dangers (as listed in the comment above) to consider that come from humans directly misusing the technology.


> What mitigations?

They did try to place limits on their API, and tried to develop classifiers for AI-vs-non-AI text (an effort that was abandoned in 2023, in a world of many models). A lot of their efforts in those days seemed to be to work with universities to figure out what to do about all of this incoming tech. They weren't the first to develop a language model.

> when they talk about it being “too dangerous”, they’re usually talking about fantasy scenarios of the AI gaining sentience

They didn't talk about "it" (that model) in those terms, as mentioned above. Or the following few from what I can see. They seem pretty specific about each model's risks and publish what they can find in the model card. But yes, they have a fear of where things may be in the future if models keep progressing.

I don't personally think talk of it being "too dangerous" is good marketing if the goal is to get rich. It invites restrictions from governments and others. I don't know anyone that picked a model because it was apparently restricted: most of their funding comes from Companies that are generally risk-averse. Online AI hype seems to mostly come from the demos, not the doomerism.

I do think there's an uncomfortable trade-off involved in all of this, and some of it comes down to whether you think the tech will be developed regardless of your participation. I believe the people in labs like Anthropic are worried yet think they are better off steering it in the right direction, so they push on.


Yes, it's not their fault, that people are using the tool they made in a malicious way.

I hate ClosedAI as much as the next guy, but this is an extremely illiberal take. It's not the kitchen knife manufacturer's fault that people are using their product for murder, it's not my fault that people are doing crimes over the Tor relay I run.

The Tobacco industry is evil because it misleads the public about its product being poisonous and bribes politicians through widespread corruption. Tobacco is also different because it is not a neutral tool that can be used for good and bad, but poisonous and will harm you no matter how you use it.


> Yes, it's not their fault, that people are using the tool they made in a malicious way.

Yeah! It’s not like they predicted these malicious uses before releasing the tools. And it’s not like they’re making them available to a dysfunctional government in order for them to militarise the technology and… Oh, wait…

> but this is an extremely illiberal take.

I’d appreciate if we stopped this Americanised version of poisonous discourse where everything is reduced to a box in a vague political ideology. By this I don’t mean politics don’t matter—they do—but not everything is black and white, right and left, or needs to be categorised to be discussed.

> It's not the kitchen knife manufacturer's fault that people are using their product for murder, it's not my fault that people are doing crimes over the Tor relay I run.

Always with the kitchen knife. That’s not an argument, it’s a talking point. Explosives are tools too, as are machine guns. No tool is entirely neutral. LLMs are not comparable to kitchen knives. Death is not the only possible bad outcome.

> Tobacco is also different because it is not a neutral tool that can be used for good and bad, but poisonous and will harm you no matter how you use it.

Tobacco is not just cigarettes.

https://leafngrainsociety.com/featured/10-surprising-uses-of...


What do you think they would do differently if they were genuinely worried about the safety?

I think those who care about safety would try to push for how 99% of all scientific research is done - in universities and actual labs, with transparent information on red teaming results.

Also with international cooperation, like how humanity regulates actually dangerous stuff: virus and vaccine research and nuclear energy.

Not hidden behind the walls of 10 commercial organizations, each pushing for commercial adoption and an IPO ASAP before the bubble bursts.

Not lying and scaremongering the public about how their models will replace everyone tomorrow or destroy civilization via cyberattacks.


That line of thinking (public goods) is why the same people started OpenAI as a non profit originally.

Notice how almost no Universities are producing large models?

A key problem is that orgs can't get enough funds to stay on the frontier. And they believe they must be on the frontier to do (and apply) safety research. OpenAI needed to spin off a for-profit subsidiary to accept investment to build things.

And it seem(ed) hard to get one government to fund and take safety seriously, let alone an international cooperation.

If they started a university/gov cooperative to solve this, do you think they would do less of the "scaremongering of the public" talk? My guess is that it would be similar.

The same kind of restrictions that you hint at (e.g., treating it like public virus research) are why they rub a lot of people the wrong way in the corporate world, I think. Normally companies downplay the risks of their own products. See cigarette companies. Anthropic do still publish safety research and red teaming info. But I do think they honestly believe they can't do this work without the resources of a company, and they were burnt by the non-profit structure (Anthropic has a "Long-Term Benefit Trust" instead).

We should definitely hold them to account, but I don't personally think Anthropic have acted in a way broadly inconsistent with safety belief yet. Many of these decisions are self-serving too (e.g. protecting models), so they haven't been seriously tested either. But the individuals do have a very long history of talking about it (including hurting their own reputation) from even before the ChatGPT-moment money train rolled in.

edit: for clarity, but still messily/quickly written


To be fair, they were proven right about automated spam, phishing and disinformation being a problem.

Yes, some of it looks silly now, though it's always easy to criticize with hindsight: the models could do unexpectedly impressive things and we didn't fully know the limit yet, it was a black box.

Remember you're criticising the org that actually made it public to people earlier than any other: the uncertainty was a temporary caution. The "open" in OpenAI was because they made it available, unlike Google at the time.


> To be fair, they were proven right about automated spam, phishing and disinformation being a problem

When the company that enables this makes the predictions in the first place, that is a self-fulfilling prophecy.


> To be fair, they were proven right about automated spam, phishing and disinformation being a problem.

I have fewer problems with those now than I did before AI. I think filters got better and what comes through is easily recognizable as AI-generated. Also, the awareness that things can be effortlessly made up to sell anything raised my baseline scepticism towards all information, which can only be good.


They aren't saying there's a 100% chance of doom.

They believe there's a non-zero chance of doom, so they would rather an org that prioritises safety be the one at the frontier, on the assumption (I presume) that there will be a frontier regardless.


Yes, I believe the reasoning is that they think safety research can best be done from the frontier.

If you believe it will be developed regardless and that there's a 30% chance of doom, they want a company prioritising safety research to be the one threading that needle.


Yeah, all they care about is safety, but let's see how many of them quit once the US government commands them to work on autonomous killbots.

To make sure we keep track of what we're talking about with loss-of-control x-risk, a sufficiently smart version of Claude Code is more deadly than any government's army of autonomous killbots, because it can recursively self improve and has unpredictable training-induced preferences.

Sufficiently smart version of Claude Code: doesn't exist.

Autonomous flying killbots: exist.

Once somebody scientifically proves and shows any kind of self-improving software, we can start bothering about it. I'm pretty sure everyone is trying to do it, and it would be all over the news once it's here.


That's exactly what Fable is. They use Fable to improve Fable. I reckon the successful experiments must go into the model training set with a strong RL signal, and that is why they are so paranoid about people using Fable for LLM tasks. Fable knows what it did to improve itself. Pure speculation of course.

We're on track to get there globally and economic pressures will ensure it happens. It's not too early to worry about it.

There's a 745 mile front in the Ukraine war that neither side has been able to pierce for months because of drone warfare. It's definitely not too early to worry about it.

I guess they were talking about self-improving software, which obviously isn't there and likely won't be there soon.

As for killbots, they are all over the frontline, but don't actually need particularly smart LLMs to run: some good-enough segmentation and pattern recognition on a smartphone SoC is enough to kill hundreds of thousands of people.


You don’t even need autonomy to be deadly. Fly by wire is proving to be very effective as a case in point.

Most of the drones are operator guided and the ether is really badly jammed out there with the exception of glass and that new redacted thingy.

It will start moving really fast once the automatically targeted anti-drone turrets get to production pipeline. Now calling it anti drone is a bit of self delusion -- pattern recogniser gonna pattern recognise whatever it's told to, including "anything moving that emits EV or IR and not broadcasting the friendly signal hard enough".

I wonder how it is supposed to behave if the invasive fauna decides to call it quits and surrender. Should the robot follow the Convention, or is it yet another accountability sink?

One thing I'm sure of -- the killing bot will be blessed by at least one Orthodox priest, maybe this year. OCU will have to develop guidance on that matter.


I saw an LLM bootstrapping and testing its own harness and rewriting its own system prompt. If that's not self-improving then I dunno what is.

Can the thing enter into a runaway loop while improving the model itself? Probably not, at least not without us noticing.


That's ridiculous scifi nonsense

Dario blinked when he was asked to do it and Sam Altman was in Hegseth's DMs promising all the AI child killing the US government can order up within minutes. No one meaningful will quit over this, that's why all of the biggest US tech companies can march in pride parades and provide compute to the perpetrators of the genocide in Gaza at the same time.

Can you show me a world power that is not trying to use cutting edge AI for military purposes?

Can you tell me when the last time China used AI to bomb a school was?

Hell can you tell me when the last time China bombed a school was even without AI?

The last time China bombed anyone was in 1979.


Because some countries are starting to use AI for military purposes, other countries are also looking into it to assess and learn. Here in Europe there is the EU AI Act to limit everyday harm to citizens caused by AI systems. However, it currently excludes military use. The new legislation starts being enforced for high-risk uses (employment filtering, biometrics, etc.) in August 2026, with full rollout in August 2027. In April 2025 there was a report from the EU saying this legislation may help pave the road for military AI usage conventions [1]

[1]: https://www.europarl.europa.eu/RegData/etudes/BRIE/2025/7695...


This is a poor way of framing the question. A better one would be: can you find me another world power that is misallocating trillions in capital on vaporware with very little to show for it?

The United States government isn't, capital is. That capital can come from outside the US, and much of it does.

That said: https://www.bloomberg.com/news/articles/2026-06-09/china-pre...


Building frontier models to do safety research on them is what Anthropic was all about in the early years. That included building the best model, but only releasing it once it became the second best. Precisely to avoid an AI arms race where everyone is forced to release better and better models, risks be damned

Something changed their mind, and since Opus 3 they are in the business of releasing the best models


Exactly. And within the AI safety discourse, your behavior hinges on what you think the default chance of doom is, and how optimistic you are about alignment work being able to limit it before we reach superintelligence.

People running the labs are in a middle camp where they are scared enough by AI to take the threat seriously, but much more optimistic about alignment than the people who seem to have thought about it the most.


The scary part is not so much that the doomers give the extinction scenario 50% (Hinton) to 95%+ chance (Yampolskiy, Yudkowsky), but that the optimists (Amodei, Bengio) give it a 10%+ chance. And everyone keeps dancing.

> If you believe it will be developed regardless and that there's a 30% chance of doom, they want a company prioritising safety research to be the one threading that needle.

They also want to be trillionaires. If they don't build it, no trillions. So they have to build it, now (and get their IPO done before the bubble pops).


It’s all ego. I, and only I, am the bringer of doom, slayer of worlds.

I am so smart that what I do will destroy humanity, or save it.

Fable 5 was great, but not that great.

Sorry to be crude, but both the government and anthropic are acting like a bunch of pussies.

Meow.


You’re not getting it. Anthropic's continual fear mongering is harming wider AI industry development, and the gov has always been looking for an excuse to assert its dominance. They got what they deserve.

Or maybe government AI regulation and international cooperation is the only thing that can break the arms race dynamics and is necessary to save us from a substantial chance of doom?

Or they could have thrown the letter away.

This policy change doesn't allow training, just like the previous one.

Note that the terms still prevent them training on the data. The retention is for abuse prevention.

Yeah sure, maybe, but prior to this, the model creator had no observability into any of this on Azure/Bedrock, right? Now they do. That's one over-eager PM or bug away from training on my clients' data.

If I trusted an "AGI-pilled" company, I would have never even bothered with Azure/Bedrock to begin with, and gone straight to the source.

AGI-pilled means that you think you are building god. They might actually be doing that, but in either case, I cannot trust people in that state of mind with my clients' most valued proprietary data.

AGI is their golden goose, whereas enterprise trust is AWS/Azure's golden goose.

edit after upvotes: I get it from the Anthropic POV. I am not an Anthropic hater; in fact, I am a huge fan. People trying to distil their models would likely use Azure/Bedrock for that purpose, as the lack of Anthropic observability would be ideal for that. Still, this all sucks for anyone building an honest business with enterprise customers.

There has to be a better way. Maybe deploy the automated observability tools to the Azure/Bedrock teams... and have them flag and investigate accounts? If Anthropic can do this, so can Azure Foundry/Bedrock teams, right? Maybe even forward deployed Anthropic folks would be ideal to keep them honest, as long as the raw data does not flow back.


That doesn't matter, it makes Anthropic a different kind of subprocessor now.

Does it? It says “We won’t use this data to train new Claude models”. Couldn’t the wording “new Claude models” allow them to use it on their existing ones? It’s vague enough to me, at least.

It's only for this model, not the one you're already using. And they're not training on the data. It's supposedly to detect abuse etc. (such as someone retrying repeatedly with different variations to get around their protections).

> they're not training on the data

How would you know that? You can only know what they say they will do with the data.


Sure, some trust is required that they aren't breaking their own terms of service (which legally enforces that they won't train on your data), but the same is true of every company/service you deal with (AWS, Google, your CRM etc). Their entire business model depends on enterprises trusting them.

>some trust is required that they aren't breaking their own terms of service

Which companies do all the time...


But if you're going to take your distrust that far then the issue is that they have your data at all, not that they are telling you that they will retain it for 30 days.

Civilization is built on trust, otherwise you’ll need to rebuild all of it yourself. This isn’t very different.

Civilization is also built on cheating and taking advantage of naive trust. This isn’t very different.

If that were dominantly true nothing would function at all. You trust and rely on thousands of people and services every day.

As others have said, if you're this skeptical I don't see why you would have been using them before this retention increase.


>If that were dominantly true nothing would function at all

And yet it is, and most things still function. Now what?

>You trust and rely on thousands of people and services every day

And I, and everybody else, distrust and try not to rely on thousands of people and services every day too.

Do you lock your car? Or your door? Do you use your username as your password, trusting nobody would stoop so low as to break it? Do you trust the government to put your tax money to good use? Do you trust emails with great offers from websites you didn't subscribe to? Do you trust companies not to sell your personal data?

>As others have said, if you're this skeptical I don't see why you would have been using them before this retention increase.

Because they have a technically more capable offering. For absolutely no other reason.


We’ve repeatedly watched that trust abused and exploited in these last few years, in both public and private sectors (including specifically in this field). I broadly agree with you, but I tend to think it’s a finite resource that’s eroding rapidly just now.

If that is the question, those customers won't be using any LLM or cloud services in the first place anyway. If you are a journalist investigating nations, stay away from everything.

If you don't trust them, then no policy is enough. Technically everything you send to the model could be stored by them. Personally I do worry about that, especially as an average consumer rather than an enterprise: no one is looking out for us and we don't get any guarantees. But enterprises will get the right treatment because they would find out and sue Anthropic if they lied.

>If you don't trust them, then no policy is enough.

No policy is enough, period. There should be technical and legal solutions to it.


There should be legal ramifications if they don't do what they say, but the practical solution is "don't use it".

I mean, if we're assuming they're just willing to lie and violate their own TOS then how could you ever be comfortable with them regardless of this 30 day period (or really any online service)? This seems like a bit of a silly take.

Why wouldn't they train on the data, I guess, if the goal is to prepare a better supervisor mechanism?


Maybe, but to do so they'd need to offer new terms of service and we'd have to accept. I believe they'd lose a lot of their core business market if they did so.

That's ... Tuesday in techbro land

You think companies would be ok with terms of service that allow potentially distributing their data and internal knowledge? It's an interesting question, though they tend to be more conservative than consumers

If the data were valuable, it seems like they would offer a lower-cost tier where customers would allow training on their data.

Still unacceptable.

The phrase "security through obscurity" isn't an argument against all information restriction.

It doesn't imply we should, for example, publish step-by-step instructions for making widespread death easier.


Another "great filter": How to handle dangerous information?

Compare the cost/ease of attacker vs defender if one person is given a virus to unleash anywhere in the world and another person is given a vaccine to distribute to the whole world. Or compare building a large bridge to someone disabling that bridge, etc. Prevention and repair are almost always more expensive than vandalism.

I don't think there's an ideal solution here, but giving trusted people access to fix security issues before giving it to the wider public seems like a reasonable compromise. They're letting you use the model for all other uses.

