The NSA's Mythos Moment Forces a Reckoning on AI Release Timelines

According to accounts relayed by Senator Mark Warner of Virginia, the director of the National Security Agency told Congress that Claude Mythos, Anthropic's cybersecurity-focused frontier model, penetrated nearly all of the NSA's classified systems in a matter of hours. The claim has sparked both alarm and skepticism across the security community, with some experts dismissing it as hype and others warning it explains the Trump administration's aggressive response to the model.

The account, if accurate, would represent one of the most consequential capability demonstrations in AI history. It would also explain why the Commerce Department moved last week to classify Mythos and its consumer-facing variant, Fable 5, as export-controlled cyber weapons, forcing Anthropic to disable access for all users worldwide.

A Model Built to Break Things

Mythos Preview was introduced in April 2026 as a frontier model designed to find and exploit software vulnerabilities. According to Anthropic's own red team assessment, the model can identify and exploit zero-day vulnerabilities in every major operating system and every major web browser. One example cited by the company: a 17-year-old remote code execution flaw in FreeBSD that Mythos discovered and exploited fully autonomously after a single prompt.

Anthropic has embedded approximately six engineers inside the NSA to help the agency customize and deploy Mythos for what reports describe as offensive cyber operations. The arrangement exists despite the Pentagon labeling Anthropic a "supply chain risk" in March 2026 and barring its products from federal agencies. The NSA secured a carve-out to continue using the model anyway.

CrowdStrike's 2026 threat research reportedly found adversary use of AI up 89% year over year. The logic driving the NSA's adoption of Mythos, according to sources close to the arrangement, is simple arms-race calculus: if U.S. adversaries will eventually deploy similar capabilities, America needs them first.

The Jailbreak Problem Has No Solution

The immediate trigger for Commerce's export control directive was a reported jailbreak of Fable 5, the safeguarded version of Mythos that Anthropic released to the public on June 9. Amazon researchers reportedly found a way to bypass Fable 5's guardrails to access the underlying Mythos cyber capabilities. Anthropic disputes the severity, calling it a "narrow, non-universal jailbreak" and noting that similar techniques work on other frontier models, including OpenAI's GPT-5.5.

But security experts say the jailbreak debate misses the point. Martin Riley, CTO at cybersecurity firm Bridewell, put it bluntly: "You cannot guarantee that a model will remain jailbreak-proof forever. Anyone promising that is selling something."

The challenge is fundamental. Traditional software bugs can be patched because a program's valid inputs are narrowly defined, so a flaw can be traced to a specific code path and closed. AI models are designed to understand natural language, whose space of possible inputs is effectively infinite. Attackers can use role-play scenarios, multi-step prompting, encoded instructions, and techniques no one has thought of yet. The input space cannot be enumerated, so the absence of attack paths cannot be proven.
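A rough back-of-the-envelope sketch makes the scale concrete. The figures below are illustrative assumptions, not numbers from Anthropic or any vendor: a vocabulary of 50,000 tokens, prompts capped at just 20 tokens, and a hypothetical tester able to check one trillion prompts per second.

```python
import math

SECONDS_PER_YEAR = 365 * 24 * 3600


def count_prompts(vocab_size: int, max_tokens: int) -> int:
    """Count distinct token sequences of length 1 through max_tokens."""
    return sum(vocab_size ** n for n in range(1, max_tokens + 1))


def years_to_enumerate(total: int, checks_per_second: float) -> float:
    """Years needed to try every sequence at a fixed checking rate."""
    return total / checks_per_second / SECONDS_PER_YEAR


# All three values are assumptions chosen for illustration only.
VOCAB_SIZE = 50_000        # assumed tokenizer vocabulary size
MAX_TOKENS = 20            # assumed (very short) prompt length cap
CHECKS_PER_SECOND = 1e12   # assumed testing throughput

total = count_prompts(VOCAB_SIZE, MAX_TOKENS)
years = years_to_enumerate(total, CHECKS_PER_SECOND)

print(f"candidate prompts: roughly 10^{math.floor(math.log10(total))}")
print(f"years to test them all: roughly 10^{math.floor(math.log10(years))}")
```

Even under these generous assumptions, exhaustive testing is out of reach, and real prompts run far longer than 20 tokens. The count also ignores multi-turn conversations, which enlarge the space further.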

This creates an impossible regulatory demand. If every narrow jailbreak triggers an export control, as Anthropic warned in its response, "it would essentially halt all new model deployments for all frontier model providers."

The Pre-Release Hardening Dilemma

The Mythos episode establishes a difficult precedent. If AI models can autonomously penetrate hardened government systems in hours, then every system those models might target needs to be hardened before the model is released. Or the model cannot be released at all.

Anthropic recognized this dilemma early. The company formed Project Glasswing specifically to use Mythos defensively, giving partners like AWS, Microsoft, Google, and Palo Alto Networks access to find and fix vulnerabilities before offensive actors could exploit them. The project has expanded to roughly 150 organizations in over 15 countries and has identified more than 23,000 potential vulnerabilities across over 1,000 open source projects.

But patching the world's software is not a task that completes on a release schedule. Project Glasswing participants have reportedly found over 10,000 serious flaws, with over 1,000 rated high or critical severity. The vast majority remain unpatched.

Open Source May Offer a Path Forward

There is a counterintuitive argument gaining traction among researchers: open source AI development may actually produce safer outcomes than the closed-source model Anthropic and others have pursued.

The logic runs as follows. Closed models concentrate power and create single points of failure. If one company's safeguards fail, or one government demands access, the entire system is compromised. Open models distribute risk. They allow researchers everywhere to study vulnerabilities, test defenses, and develop countermeasures without depending on any single organization's judgment.

Academic research supports this view. Open datasets can be analyzed for toxic content. Open model internals can be studied for alignment failures. Open safeguards like Meta's Llama Guard can be improved by anyone. The transparency that makes open models theoretically more dangerous to release also makes them easier to secure over time.

The International AI Safety Report acknowledged this tension, noting that greater AI openness facilitates innovation, improves safety and oversight, and allows tools to be tailored to diverse needs. The cost of training frontier models now exceeds $100 million and is projected to surpass $1 billion by 2027. Open weights democratize access to capabilities that would otherwise concentrate in a handful of well-funded labs and their government partners.

None of this resolves the immediate crisis. Mythos exists. Its capabilities are real, whether or not the NSA breach claim holds up under scrutiny. And the safeguards Anthropic developed were bypassed within days of public release.

The policy apparatus is, as Dario Amodei wrote in a blog post last week, "slow and rickety." He called for mandatory third-party testing of model risks in cybersecurity, biological weapons, loss of control, and automated R&D that could accelerate other risks. The Trump administration's response so far has been to use export controls as a blunt instrument, forcing Anthropic to disable its most capable models for everyone because it could not guarantee they would stay out of the wrong hands.

The next generation of frontier models is already in development. The question is no longer whether AI can break things. The question is whether we can build institutions capable of managing systems that improve faster than we can secure them.
