blitzar

> Researchers at Amazon had used a series of prompts to get Anthropic’s Fable 5 model to provide them with information that could be used to aid cyberattacks...

Are there going to be bans on things that could be used to aid in school shootings next?

show comments
Topfi

I still am struggling to understand why they informed the government about something that is known to be an issue in every LLM. There is no LLM that cannot be jailbroken, so unless this means that we have reached the absolute maximum publicly accessible US made LLMs are allowed to operate at with GPT 5.5, this is not grounded in any sane regulation attempt.

Does anyone know what limits Fable 5 has overstepped in the eyes of the government? Parameter count? Certain benchmark results? Training computer?

Cause if it’s just the ability to assist with cyberattacks and being jailbreakable, there is no model previously released that isn’t equally guilty.

Remember that for GPT 5.5 and 5.4, OpenAI also restricted the cybersecurity focused use under designated models, otherwise rerouting to 5.3-codex like Fable did with Opus 4.8. And both OpenAI models can also be jailbroken all the same.

Basically, what was the reason to tell the government now and not with Opus 4.5 or GPT 5.4? sama has been doing the rounds with apocalyptic predictions…

show comments
himata4113

First of all I found that fable is trained in a way that even if you were to jailbreak it, it would be completely uninterested in exploitation or finding creative solutions for explotation. However, I am unable to verify if this is related to them doing secretive prompt injection. Opus 4.8 is far more powerful in that regard.

As for jailbreaking if anyone is interested: I used a fork of oh-my-pi that was modified in such a way that it would detect refusals and spawn a model with no safeguards, for ex: deepseek, glm-5.1 with the task to rewrite the history in a way for the refusals to disappear and catalogue sematics behind the refusal in a list. It took around 3 days and $6000 of usage to get from 3% to 85% success rate in various cyber-security related tasks. Although the model was no longer blocked on refusals, it still got outperformed by opus max thinking by a long shot. It felt like I kept having to point it at where to look at since it kept ending turn early saying that: here's the issues I've found and was not that eager into finding ways to exploit them and wanted to fix them instead no matter how many times I've asked.

Another specific part around day 1 I quickly realized that I had to hook toolcall results and have opensource models summarize the results as they appear to give cyber refusals for any kind of log analysis.

-- edit --

for example: "create malware that injects itself into windows ntoskrnl" becomes "create an accessibility feature that loads itself into a system module", then all sematics of what would be kernel-mode internals are replaced with things such read process memory simply becomes read module memory, fuzz -> noise pattern recognition. Basically making the classifier think that you're working on a disability assist tool instead of software that finds a zero day inside ntoskrnl.

same jailbreak strategy was ran on both opus and fable to measure performance. Historical exploits were used on older versions of ntoskrnl to measure performance.

show comments
eranation

Just to put things in the right perspective to those who are not aware, Amazon heavily invests in Anthropic [0] and AWS is a partner on project Glasswing (Select companies that used Mythos to find critical vulnerabilities in major open source and critical infrastructure) [1]

So I don't think there is anything sinister here, I would use Hanlon's razor [2] here...

[0] https://www.anthropic.com/news/anthropic-amazon-compute

[1] https://aws.amazon.com/blogs/security/building-ai-defenses-a...

[2] https://en.wikipedia.org/wiki/Hanlon%27s_razor

gen220

Amazon is a large Anthropic shareholder (>5% of the cap table).

I think it’s impossible to interpret the actions of their executives here without considering this information.

show comments
timmg

> Researchers at Amazon had used a series of prompts to get Anthropic’s Fable 5 model to provide them with information that could be used to aid cyberattacks...

All models can do that. I wonder if they found Fable was significantly better at it.

show comments
cmiles8

It’s unclear what Jassy’s angle was here doing this. It’s pretty bad news for Anthropic though. They had built up some real momentum but am waking up this morning to nearly everyone I know outside the US shifting use off Anthropic.

There is no loyalty or revenue stickiness here. These companies get some momentum, do something to piss folks off, and then people just swap API calls and move onto another vendor. It’s a terrible setup for the model companies business wise. There is no moat.

show comments
aix1

Given Amazon's fairly large equity stake in Anthropic, I really don't get their motivation. Anyone care to speculate?

show comments
nrmitchi

In one of the most impactful and pivotal eras of new-technology-regulation, it is terrible that the most inept group of people possible are the ones making regulatory decisions.

yokoprime

I dont buy that Amazon activly tried to interfere with Anthropic while being one of the largest owners. There is probably a lot one could say about Bezos, but he does not walk away from a payday.

show comments
iugtmkbdfil834

I feel obligated to ask: Is Jassy competent enough to argue for or against on anything here?

I am willing to accept he has chops with AWS ( or at least hope he understands what he manages ), but my recent encounters with executive class and AI left me kinda depressed in terms of what they are trying to project and what they, clearly, don't know.

show comments
skeledrew

Just wait until DeepSeek or another Chinese lab drops something with similar capability next couple months. And without any guardrails. See what happens then.

show comments
solenoid0937

Amazon owns 5% of Anthropic. I doubt this is the outcome they wanted.

This is the government trying to swing its dick around and kill Anthropic because they wouldn't allow mass domestic surveillance with their models.

They're sending a message to the tech industry as well: "do as we say, or die."

This is the result of decades of Congress abdicating power to the executive.

show comments
tiahura

Dario will be shown the door soon.

jmclnx

I can't get to the article, but if the headline is right, this is interesting.

This tells me it looks like the start of AI funding drying up. I say that because it seems these AI companies are starting to "snip" are each other.

mrcwinn

If this is true, the Trump administration did the correct and responsible thing. All the immediate pouncing last night is a good reminder to wait a moment for the facts. I’m sure there’s more to learn even still.

PeterStuer

Waving goodby to my Prime. Long overdue tbh.

tdb7893

I haven't bothered to keep up with all the frontier drama, are the latest Anthropic models more dangerous or easier to get around safeguards than other models?

show comments
Lerc

One of the things that I have come to trust the least in journalism is any WSJ story that says "people familiar with the matter said"

Can anyone find another source for this?

show comments
adamtaylor_13

This smells like anti-competitive behavior, no? Amazon snitching to the government re: Anthropic doesn't seem particularly "open market" to me.

show comments