Stephen Van Tran

Two labs, one week, and the same unsettling verdict

Seven days. Two announcements. One conclusion that neither company stated plainly but that every security professional reading between the lines understood immediately: frontier AI has crossed a threshold where its most capable models can break into almost anything faster than any human team can patch the holes, and the labs that built those models have decided the only responsible response is to lock them up.

Anthropic moved first. On April 7, the company announced Project Glasswing, a controlled deployment of Claude Mythos Preview — its most capable model, one Anthropic has been explicit about never releasing publicly — to a hand-selected cohort of roughly 40 organizations responsible for critical software infrastructure. The launch partners read like a cybersecurity who’s-who: Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks. Anthropic committed $100 million in usage credits plus $4 million in direct donations to open-source security organizations. Seven days later, OpenAI answered: the April 14 release of GPT-5.4-Cyber, a dedicated cybersecurity model available exclusively to verified participants in its Trusted Access for Cyber program, backed by $10 million in API credits and guarded by stringent Know-Your-Customer identity verification.

The symmetry is not coincidental. Both companies independently arrived at the same conclusion: the models they are building have become so capable at finding and exploiting software vulnerabilities that releasing them without controls would be the equivalent of handing a master key to every lock on the internet to anyone who asks. The models are not equally capable — Anthropic’s Mythos Preview remains more restricted precisely because it is more dangerous — but the underlying logic is identical. These are weapons as much as tools, and the decision about who gets to use them is no longer just a business question. It is a security-policy decision with national infrastructure implications.

The stakes materialized in a specific and alarming disclosure. According to Anthropic’s own technical report, Claude Mythos Preview has already identified thousands of zero-day vulnerabilities across every major operating system and every major web browser. Among those finds: a 17-year-old remote code execution vulnerability in FreeBSD — catalogued as CVE-2026-4747 — that Mythos discovered and then exploited end to end, achieving complete server control without human assistance. This is not a benchmark score or a lab demonstration. It is a documented instance of an AI system independently compromising a production-grade target that human security researchers had missed for nearly two decades. The implication for offensive actors who might obtain equivalent capability is difficult to overstate and impossible to dismiss.

The cybersecurity industry spent years warning about this moment in forecast documents and conference talks. What changed in April 2026 is that the warning became a product launch — and then a competing product launch one week later.

Forty companies, a 17-year-old bug, and $104 million to fix the rest

The structural difference between Anthropic’s and OpenAI’s approaches reveals a genuine philosophical split in how to manage AI capability that has outpaced the defenses available to stop it.

Anthropic’s Project Glasswing model is deliberately narrow. The roughly 40 participating organizations are not customers in any conventional sense — they are custodians whose software sits at the foundation of global digital infrastructure. The Linux Foundation maintains the kernel running most of the world’s servers. Cisco builds the routers carrying the internet’s traffic. Palo Alto Networks deploys security tooling across tens of thousands of enterprise endpoints. Giving these organizations access to an AI that can autonomously identify and exploit vulnerabilities serves a specific strategic purpose: patch the most consequential software before bad actors discover the same flaws through other means. Anthropic frames this as using AI to “secure critical software for the AI era” — which is technically accurate and tactically convenient, since it lets Anthropic deploy Mythos Preview in a controlled environment while gathering real-world performance data impossible to collect otherwise.

The FreeBSD CVE case is instructive about the velocity advantage this creates. A 17-year-old vulnerability is not an edge case in the software security world — it is a routine artifact of how complex systems accumulate technical debt. Human security researchers have finite time and prioritize known attack surfaces. Mythos Preview, operating autonomously, has no such constraint. It can scan codebases that have not received security attention in years, apply its understanding of exploit patterns across language families, and identify the intersection of obscure code paths that produces an exploitable condition. The AI does not get tired, does not have a limited research budget, and does not need to sleep between sprints. The competitive advantage it confers is fundamentally about throughput: the ability to examine vastly more code, in vastly more depth, in vastly less time than any human team can match.
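
The throughput argument is easiest to see as a pipeline shape. Below is a minimal sketch, assuming a hypothetical scan_repository() wrapper around whatever code-analysis model a team has access to; none of these names come from Anthropic’s actual tooling.

```python
# Hypothetical illustration of the throughput argument: a human team reviews
# codebases serially, while an autonomous scanner fans out across all of them.
# scan_repository() is invented for this sketch and stands in for one
# model-driven audit pass; it is not part of any Anthropic or OpenAI API.
from concurrent.futures import ThreadPoolExecutor

def scan_repository(repo_url: str) -> list[str]:
    """Placeholder: audit one codebase and return candidate findings."""
    return []  # wire this to whatever analysis backend you actually use

def scan_all(repo_urls: list[str], workers: int = 32) -> dict[str, list[str]]:
    # The shape is the point: coverage scales with available compute, not
    # with analyst headcount, and codebases that have not seen security
    # attention in years cost nothing extra to include.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(repo_urls, pool.map(scan_repository, repo_urls)))
```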

OpenAI’s approach is architecturally different and philosophically more expansive. GPT-5.4-Cyber is not Mythos Preview. It is a fine-tuned variant of GPT-5.4 with lower refusal thresholds for legitimate cybersecurity work and new capabilities including binary reverse engineering — the ability to analyze compiled software for vulnerabilities without access to source code. This matters enormously in practice, because the vast majority of real-world vulnerability research happens against compiled binaries, not readable source. Binary RE is a specialized skill that takes years to develop; deploying it as a model capability compresses that expertise barrier to the cost of an API call. While Anthropic limits Mythos Preview to roughly 40 institutional partners, OpenAI is scaling its Trusted Access for Cyber program to thousands of verified individual defenders and hundreds of teams responsible for securing critical software, with plans to expand further in coming months.
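
In practice, access would look like an ordinary API call. Here is a minimal sketch using the standard OpenAI Python SDK; the model identifier and prompt shape are assumptions for illustration, since the exact GPT-5.4-Cyber interface has not been published here.

```python
# Minimal sketch of a binary-RE triage request, assuming a TAC-verified
# API key and a "gpt-5.4-cyber" model identifier. Both the identifier and
# the prompt format are assumptions, not documented API details.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def triage_disassembly(disassembly: str) -> str:
    response = client.chat.completions.create(
        model="gpt-5.4-cyber",  # assumed name, per the announcement
        messages=[
            {"role": "system",
             "content": "You are assisting a verified defender. Identify "
                        "memory-safety issues in the following disassembly "
                        "and rate the exploitability of each."},
            {"role": "user", "content": disassembly},
        ],
    )
    return response.choices[0].message.content
```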

The access models reflect different theories of how defensive advantages compound. Anthropic’s thesis is concentration: a small number of organizations with enormous surface area and maximum institutional accountability can use Mythos Preview to generate the highest return per model use, since patching a vulnerability in the Linux kernel protects every system running Linux simultaneously. OpenAI’s thesis is distribution: a large number of verified individual defenders working across the full diversity of the software stack will collectively find more vulnerabilities than any constrained cohort, even if each individual patch is smaller. Both theories are defensible. Neither has been validated at scale, because no AI system has ever been deployed for security work at this level of capability until this month.

The background context makes both programs feel simultaneously essential and insufficient. Hadrian’s research team cataloged 70 open-source AI penetration testing tools as of March 2026 — up from fewer than five before GPT-4’s release in April 2023. Roughly 65 AI-powered offensive security tools appeared in under three years. The rate of tooling development on the offensive side is not slowing; it is compounding. What Anthropic and OpenAI have announced as “trusted access” programs for defenders exists against a backdrop in which attackers have already absorbed earlier capability generations and are deploying them against production targets. IBM announced new cybersecurity measures in mid-April specifically designed to address “agentic attacks” — a category that did not exist as an enterprise product concern eighteen months ago.

Here is the quantified insight that requires stitching together multiple sources to see clearly: Anthropic’s $100 million in Glasswing credits plus OpenAI’s $10 million in TAC credits equals $110 million in AI compute committed to defensive cybersecurity. Global cybercrime caused an estimated $9.5 trillion in damage in 2024. The funding ratio is roughly 1:86,000. The defenders are not losing because they lack commitment. They are losing because attackers have better economics. An AI model compressing weeks of vulnerability research into hours eliminates the human labor cost that was the primary friction in offensive operations. The Glasswing and TAC programs represent a genuine shift in defensive tooling, but they are fighting an economic asymmetry that $110 million in credits cannot resolve alone.
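
The arithmetic behind that ratio is simple enough to verify directly from the figures cited above:

```python
# The funding asymmetry, computed from the figures cited in this section.
defensive_credits = 100e6 + 10e6   # Glasswing + TAC commitments, USD
cybercrime_damage_2024 = 9.5e12    # estimated global damage in 2024, USD

ratio = cybercrime_damage_2024 / defensive_credits
print(f"1 : {ratio:,.0f}")         # -> 1 : 86,364, roughly 1:86,000
```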

The airlocks will not hold

The restricted-release architecture that both Anthropic and OpenAI have adopted rests on assumptions that deserve hard scrutiny, because if any of them break, the entire safety framework collapses.

The first assumption is that identity verification works at scale. OpenAI’s TAC program requires Know-Your-Customer checks, identity verification, and institutional affiliation confirmation. This is meaningful friction. But security researchers have spent three decades demonstrating that any access control system with a wide enough aperture develops exploitable gaps. Scaling TAC to thousands of individual users multiplies the attack surface for social engineering, credential theft, and insider misuse. The history of sensitive capability programs leaking through their access controls — from classified government tools to premium commercial zero-day brokers — does not encourage optimism. The Register noted in April that Anthropic’s Project Glasswing CVE count remains largely unverified by independent parties, raising questions about whether the vulnerability discovery claims are as dramatic as press materials suggest. That skepticism does not undermine the program’s value, but it matters for calibrating expectations: the actual impact of these deployments will not be known for months, and the restricted-release architecture makes independent verification difficult by design.

The second assumption is that adversaries do not already have equivalent capabilities. This is the hardest premise to defend. The 70 open-source AI pen-testing tools cataloged by Hadrian are publicly available. Nation-state actors with dedicated AI research programs — China, Russia, Iran, North Korea — have resources and incentives that make it implausible they are waiting for OpenAI or Anthropic to ship them models. The more relevant question is not whether adversaries will eventually obtain frontier cyber AI, but whether defenders get a meaningful head start. The window between Anthropic’s April 7 announcement and whatever analogous capability an adversary nation develops is measured in months, not years. NBC News reported that security experts fear AI could permanently tip the scales toward hackers — not because defenders are not trying, but because the asymmetry between attacking and defending has always favored attackers, and AI multiplies that asymmetry. Defenders using Glasswing today have a time-limited advantage that requires velocity to convert into durable security improvements. Patching a zero-day that Mythos discovered this week is a win. Patching the next thousand zero-days before a state-sponsored AI finds them first is the actual contest.

The third assumption is that the restricted models are safe from exfiltration. Model weights are files. Files can be stolen. The late-March 2026 leak of details about Claude Mythos demonstrated that even sensitive information about frontier AI models can escape the lab before companies intend it. A full model weights exfiltration would be categorically more dangerous than a capability description leak, giving adversaries direct access to the model rather than knowledge of its existence. The security protocols protecting those weights are presumably excellent. They are also among the highest-value targets in the world for state-sponsored intelligence operations. The precedent from the U.S. semiconductor industry, where export control regulations could not prevent the diffusion of advanced chip design knowledge to Chinese firms, suggests that containment strategies have a poor historical track record against determined adversaries with long time horizons.

The fourth and most troubling assumption is that restricted releases scale to the full extent of the problem. Claude Mythos Preview found thousands of zero-days in critical software in weeks. If a model of this capability had been run against the complete landscape of critical open-source software six months earlier, how many vulnerabilities would it have found? How many of those vulnerabilities are currently being exploited by actors who discovered them through other means? The restricted-access program addresses the future. It cannot retroactively secure the past. Cybersecurity Dive reported that autonomous AI attacks already ushered cybercrime into the AI era in 2025: ransomware pipelines now run without human operators, targeting selection happens automatically, and the time from initial access to data exfiltration has compressed from days to hours. Glasswing and TAC are responses to a threat that may already be materially ahead of the defense.

Security researcher Bruce Schneier argued on his blog that Project Glasswing “sounds necessary” but carries implications Anthropic has not fully addressed publicly: what happens when equivalent model capability is available to governments using it for offensive operations under an “all lawful purposes” framing? The contract language OpenAI used for its Pentagon deal in February 2026 permits the U.S. government to deploy AI for any lawful purpose — a carveout that is not a restriction but permission at the highest level of consequence. The Nextgov reporting on Glasswing’s implications for U.S. cyber operations notes that the initiative explicitly invites federal partners without addressing how the same model capability might be used in offensive contexts by government customers. A defensive tool and an offensive weapon are often the same artifact with a different mission order.

Before the next model finds your flaw first

The defensive posture available to security organizations right now is more actionable than the macro analysis suggests, provided those organizations move faster than the adversaries building equivalent capabilities outside the controlled-access programs.

The strategic picture for the next 24 months is clear even if it is uncomfortable. OpenAI’s GPT-5.4-Cyber and Anthropic’s Mythos Preview are early-production versions of a technology that will be partially democratized within that window, regardless of how carefully the frontier labs manage access today. Earlier generations — the ones behind those 70 open-source pen-testing tools — are already in attackers’ hands. The current restricted models are the next-generation advantage available to defenders who engage now. That advantage has a shelf life. The playbook is not to trust the airlocks indefinitely. It is to use the head start those airlocks provide while they still function.

The data on what that head start looks like is compelling. According to the Ethiack analysis of AI hacking in early 2026, AI agents now outperform humans at discovering vulnerabilities on public bug bounty programs — with autonomous AI reporting agents reaching the top of HackerOne leaderboards in 2025. The SecurityWeek Cyber Insights 2026 report found that 48 percent of cybersecurity professionals believe agentic AI will be the top attack vector by the end of 2026. Meanwhile, only 29 percent of organizations feel ready to defend against agentic attacks, even as 83 percent plan to deploy agentic capabilities of their own. The gap between deployment intentions and defensive readiness is a structural vulnerability, and it is widening faster than most organizations’ security programs can compensate.

For security professionals and operators navigating this transition, the priority checklist is concrete:

  • Apply for Project Glasswing or TAC access immediately if your organization qualifies. Glasswing targets organizations that build or maintain critical software infrastructure — if your team maintains open-source libraries, operating system components, browser code, or foundational network infrastructure, you likely meet the threshold. OpenAI’s TAC program reaches thousands of individual researchers; the barrier is identity verification, not institutional prestige. The combined $110 million in committed compute represents free vulnerability discovery capacity that your security budget did not have to fund. The only cost is the time to apply and the organizational will to integrate AI-generated findings into your remediation pipeline.

  • Restructure your triage process for AI-scale CVE volume. Mythos Preview identified thousands of vulnerabilities in weeks across a curated set of critical software. Your existing triage stack — manual review, CVSS prioritization, backlog grooming — was designed for a world where human researchers produced dozens of findings per quarter, not thousands. The bottleneck in an AI-augmented security program is no longer discovery. It is the human judgment required to assess which AI-generated findings represent genuine exploit paths versus theoretical vulnerabilities that CVSS scores poorly. Invest in that review infrastructure before the volume of AI-discovered flaws overwhelms your remediation capacity; a minimal sketch of that prioritization step follows this list.

  • Calibrate your threat model to a 12-to-18-month adversary lag. The 70 open-source AI offensive tools available as of March 2026 were built on model generations 12 to 18 months behind the frontier. Defenders using Mythos Preview and GPT-5.4-Cyber today have a real lead, but it is measured in months, not years. Every patch cycle matters. Every vulnerability that Glasswing finds and fixes this quarter is one fewer entry point available to an adversary developing equivalent capability in 2027.

  • Shift your defensive architecture toward agentic response. Traditional perimeter defenses and signature-based detection are insufficient against agents that can rewrite their own attack patterns in real time. IBM’s April 2026 cybersecurity announcement explicitly addressed this gap. Deploying defensive agentic AI — systems that can monitor, respond, and pivot autonomously — is no longer a speculative roadmap item. It is the minimum viable response to an adversary that no longer requires human operators to execute complex multi-stage attacks. The organizations that deploy agentic defense in 2026 will be materially more resilient than those that do it in 2028, because 2027 is when the current frontier capability will be open-source.

  • Engage the governance conversation before it is decided without you. The “all lawful purposes” framing in AI contracts, the shape of access controls for the next generation of restricted models, and the regulatory frameworks being developed at state level and in Washington will determine whether the defensive advantage that Glasswing and TAC represent is durable or temporary. Security practitioners who understand the technical reality of what these models can do are the most credible voices in those policy debates. The people writing the access rules right now mostly do not have your operational context.
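
The prioritization step referenced in the triage item above can be made concrete. This is a minimal sketch assuming AI-generated findings arrive as dicts with model-assessed fields; the field names and weights are invented for illustration and should be calibrated against your own remediation data.

```python
# Minimal sketch of an AI-scale triage queue. The finding schema and the
# weights below are invented for illustration, not a published format.
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Finding:
    rank: float
    cve_id: str = field(compare=False)

def rank(finding: dict) -> float:
    # Rank by demonstrated exploit path and blast radius, not CVSS alone:
    # a finding with a working proof of concept against a widely deployed
    # component outranks a high-CVSS theoretical flaw.
    score = finding["cvss"] / 10.0
    if finding.get("poc_verified"):                      # working exploit
        score += 1.0
    score += 0.5 * finding.get("deployment_reach", 0.0)  # 0..1 estimate
    return -score  # heapq pops the minimum, so negate the score

def build_queue(findings: list[dict]) -> list[Finding]:
    queue = [Finding(rank(f), f["cve_id"]) for f in findings]
    heapq.heapify(queue)
    return queue  # heapq.heappop(queue) yields the top finding first
```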

The most honest framing of this moment is not reassuring: the restricted-access model Anthropic and OpenAI have built is a triage mechanism, not a solution. It buys time for defenders to patch the most critical vulnerabilities before attackers with equivalent capability appear. It does not close the underlying gap between the attack surface that complex software creates and the human capacity to secure it. Only AI-scale automation of defensive security — applied as broadly and with as much urgency as the offensive tools being built in parallel — can address that gap at the level the threat requires.

What Project Glasswing and the Trusted Access for Cyber program represent, at minimum, is an institutional commitment to that automation from the two organizations most capable of delivering it. Whether that commitment scales fast enough is the question April 2026 has put on the table. The models have already demonstrated the capability; what remains to be seen is whether the organizations responsible for the world’s software can absorb the findings fast enough to matter.

In other news

Anthropic ships Claude Opus 4.7 — Anthropic released Claude Opus 4.7 on April 16, its strongest publicly available model, with a new “xhigh” reasoning effort level, tripled image resolution support, and coding improvements that VentureBeat reported narrowly reclaim Anthropic’s lead on frontier model benchmarks. The model ships at the same price as Opus 4.6 — an unusual move for a flagship upgrade, and one that signals Anthropic is prioritizing adoption velocity over margin expansion heading into a rumored IPO process.

Google negotiates classified Gemini Pentagon deployment — Alphabet is in discussions with the U.S. Department of Defense to run Gemini models in classified environments, according to reporting by The Information. The proposed contract terms include guardrails against domestic mass surveillance and autonomous weapons targeting without human oversight — language similar to OpenAI’s February 2026 “All Lawful Purposes” Pentagon agreement. The deal would mark a significant reversal from Google’s prior public stance on military AI contracts after it withdrew from Project Maven in 2018.

Anthropic rebuffs $800 billion valuation offers, advances IPO — Multiple venture firms offered to invest in Anthropic at valuations exceeding $800 billion — more than double its $350 billion February 2026 valuation — as Anthropic’s annualized revenue crossed $30 billion in early April. The company is advancing IPO discussions with Goldman Sachs, JPMorgan, and Morgan Stanley, with OpenAI itself surpassing $25 billion in annualized revenue and taking parallel steps toward a public listing potentially as soon as late 2026.

MIT Technology Review charts the AI state of play — MIT Technology Review published a comprehensive visual analysis of where AI stands heading into mid-2026, covering benchmark trajectories, enterprise adoption rates, and the narrowing gap between US and Chinese frontier model performance — a convergence the Stanford AI Index called the most significant competitive shift since 2023.