Google catches the first AI-built zero-day in the wild
The line moved on Monday, and most CISOs missed it
For two years, the cybersecurity industry argued in conference panels about whether large language models would ever produce a usable zero-day on their own. On May 11, Google’s Threat Intelligence Group published a report that ended the argument. In a single document, GTIG disclosed that it had identified — and disrupted before deployment — the first documented case of a criminal threat actor using an AI model to discover and weaponize a previously unknown vulnerability in widely deployed software, then turning it into a Python exploit aimed at a mass-exploitation campaign. The vulnerability was a two-factor-authentication bypass in a popular open-source web-based system administration tool, and the operators were one step away from pushing it across thousands of internet-exposed instances. Google’s Cloud blog post on adversaries leveraging AI for vulnerability exploitation is the primary source, and the framing matters: GTIG explicitly called this “the first malicious AI use” for a zero-day, and the qualifier “first” is doing heavy work: it implies a second case, and then a long list of them, to come.
The news cycle around the report compressed the story into a one-liner — AI wrote a zero-day, Google caught it — but the operational implications are not one-liner-sized. As CNBC reported the same day, the criminal group’s plan was a synchronized mass-exploitation event against unpatched targets, the kind of operation that historically requires either a nation-state-grade bug broker or a months-long manual vulnerability hunt. With an LLM in the loop, the hunt collapsed into something closer to a directed search, and the weaponization step took hours rather than weeks. Fortune’s coverage stitched the story to a broader threat picture, framing the GTIG findings as evidence that the long-feared collapse in the cost of offensive AI has finally arrived for real-world targets, not academic benchmarks. The defenders did win this round. The economics of the next round look different.
The stakes for any organization running open-source administrative software — which is virtually every enterprise — are immediate. The specific bug GTIG described arose from a high-level semantic logic flaw, a hard-coded trust assumption inside an authentication path. That is precisely the class of vulnerability that traditional fuzzers and static analyzers struggle with, because the code “works” but its assumptions break under a creative read. LLMs are good at creative reads of intent, which is why the bug was found, and they are also good at writing tidy Python that bypasses the assumption once spotted. The Hacker News writeup of the GTIG report noted that the exploit script carried unmistakable LLM fingerprints — educational docstrings, hallucinated CVSS scoring sections, textbook-style PEP 8 formatting — that the GTIG analysts treated as strong evidence the code was machine-generated rather than human-authored. Those telltales will get scrubbed by professional operators within a quarter. The technique will not.
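GTIG has not published the vulnerable code, so any illustration is necessarily hypothetical, but the bug class itself is easy to sketch. The fragment below invents every name and detail; it exists only to show what a hard-coded trust assumption inside a 2FA path looks like and why it survives fuzzing.

```python
# Hypothetical illustration of the bug class GTIG describes, not the actual
# vulnerable code: a hard-coded trust assumption inside an authentication path.

TRUSTED_INTERNAL_PREFIX = "10.0.0."  # assumption baked in long ago

def grant_admin_session(password_ok: bool, totp_ok: bool, source_ip: str) -> bool:
    """Decide whether to grant an admin session after the password check."""
    if not password_ok:
        return False
    # The flaw: requests that "look internal" skip the second factor entirely.
    # The developer assumed only trusted hosts could ever present such an
    # address, but a proxy misconfiguration or spoofable header breaks that.
    if source_ip.startswith(TRUSTED_INTERNAL_PREFIX):
        return True   # 2FA silently bypassed
    return totp_ok    # everyone else must pass the second factor
```

A fuzzer sees a function that never crashes and always returns a boolean; a reviewer, human or model, asking "under what conditions does this return True without totp_ok?" reads the intent and spots the bypass immediately. The fix is unglamorous: enforce the second factor unconditionally and keep network trust out of the authentication decision.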
GTIG’s framing places this on a continuum with the rest of the May 2026 threat data, not as an isolated incident. The same report cataloged active AI-augmented malware families, autonomous Android backdoors using the Gemini API at runtime, and supply-chain compromises specifically aimed at AI development infrastructure. Infosecurity Magazine characterized the cumulative effect bluntly — AI-assisted hacking is no longer a 2027 risk, it is a Q2 2026 operational reality, and the gap between offensive demonstration and defensive readiness widened in the past 90 days. The bug is fixed. The capability is loose. The question this week is no longer whether AI can produce a zero-day. The question is what an enterprise security program looks like when that capability is not rare.
Inside the playbook: PROMPTFLUX, PROMPTSPY, and the new malware factory
Read the GTIG report past the headline and a different story emerges: not one bug, but an entire production line. The AI-built zero-day is the most cinematic finding, but the more durable threat is the catalog of AI-enabled malware families that GTIG documented in the same release. The standouts — PROMPTFLUX, HONESTCUE, CANFAIL, and LONGSTREAM — are all Russia-nexus families that use LLM calls at runtime, not just at compile time, to mutate their own behavior and disguise their intent from static analyzers. SecurityWeek’s coverage of the GTIG findings summarized the pattern: PROMPTFLUX uses the Gemini API to dynamically rewrite portions of its own code at execution time, while HONESTCUE and CANFAIL embed LLM-generated decoy logic to fool sandboxes and reviewers into classifying the binary as benign developer experimentation. LONGSTREAM, the most darkly funny example, contains thirty-two instances of repetitive daylight-saving-time queries inserted by an LLM as filler to populate the sample with innocuous-looking activity. These are not toy proofs. They are samples GTIG pulled out of live operations against Ukrainian targets.
The Android side of the catalog is where the autonomy story gets serious. GTIG’s writeup describes PROMPTSPY, an Android backdoor whose autonomous module, GeminiAutomationAgent, calls the gemini-2.5-flash-lite endpoint at runtime to read the device’s accessibility tree, plan a sequence of taps and swipes, and execute them without an operator in the loop. Tom’s Hardware called the combination “self-morphing malware and Gemini-powered backdoors” and framed PROMPTSPY as evidence that an autonomous-malware era is now operational rather than theoretical. The technical detail that should jar enterprise security architects is that PROMPTSPY’s command-and-control infrastructure and Gemini API keys are runtime-updateable, meaning the malware family can switch model endpoints or pivot to a new credential pool faster than a typical incident-response cycle can map its current behavior. The traditional indicator-of-compromise model — block the hash, block the domain, hunt the hash again next week — is structurally outmatched by a sample whose behavior is regenerated on demand.
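GTIG has not released detection logic for PROMPTSPY, but the paragraph above points at the durable signal: if a sample phones a hosted model at runtime, the egress is harder to regenerate away than the binary. The sketch below is a minimal, assumption-laden hunt over an egress or DNS log; the CSV column names and the allowlist entries are invented, and the hostname list is deliberately partial.

```python
import csv
from collections import defaultdict

# Hosted-model API endpoints worth watching in egress telemetry.
# Deliberately partial; extend for the providers your environment touches.
LLM_API_HOSTS = {
    "generativelanguage.googleapis.com",  # Gemini API
    "api.openai.com",
    "api.anthropic.com",
}

# Sources that are *expected* to call model APIs (your AI gateway, sanctioned
# apps). Anything else reaching these endpoints deserves a closer look.
EGRESS_ALLOWLIST = {"ai-gateway-prod-01", "ml-batch-runner"}  # invented names

def hunt(proxy_log_csv: str) -> dict[str, set[str]]:
    """Map non-allowlisted source hosts to the model APIs they contacted.

    Assumes a CSV log with 'src_host' and 'dest_host' columns; adapt the
    column names to whatever your proxy or DNS logging actually emits.
    """
    findings: dict[str, set[str]] = defaultdict(set)
    with open(proxy_log_csv, newline="") as fh:
        for row in csv.DictReader(fh):
            src, dest = row["src_host"], row["dest_host"].lower()
            if dest in LLM_API_HOSTS and src not in EGRESS_ALLOWLIST:
                findings[src].add(dest)
    return dict(findings)

if __name__ == "__main__":
    for src, dests in hunt("egress.csv").items():
        print(f"[!] {src} contacted model APIs outside the allowlist: {sorted(dests)}")
```

The design choice is the point: the rule keys on where the traffic goes rather than what the binary hashes to, which is the one property runtime self-rewriting cannot mutate away without abandoning the hosted model altogether.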
The state-aligned activity GTIG documented is the third leg of the stool. The report named several China-nexus operators using Gemini for reconnaissance and exploit development at scale: UNC2814 used “expert persona” jailbreaks to coax the model into helping research firmware in TP-Link devices and OFTP file-transfer implementations; APT45 issued thousands of recursive prompts validating proof-of-concept exploits against catalogued CVEs; APT27 used Gemini to develop tooling for managing operational relay box networks. CyberScoop’s writeup of the GTIG findings emphasized that these are not novel jailbreaks — they are old social-engineering patterns applied at industrial scale to free-tier model accounts, and the volume is the story. Each individual prompt looks defensible. The pattern across a thousand prompts is reconnaissance work that would have taken a human researcher a month, compressed into a single afternoon of automated querying.
The fourth leg is the supply chain. GTIG attributed a cluster of compromised PyPI packages and GitHub repositories — including the AI gateway utility LiteLLM, the security scanner Trivy, and other dependencies central to enterprise AI development workflows — to a cybercrime crew it tracks as TeamPCP, also known as UNC6780. The payload, a credential stealer named SANDCLOCK, exfiltrated AWS keys, GitHub tokens, and AI API secrets, and the operators monetized the captured access by selling onward to ransomware affiliates. The Register’s account of the report flagged the supply-chain angle as the most underdiscussed risk in the cycle, because every enterprise that has pulled a LiteLLM update in the past sixty days is potentially carrying a foothold into its own AI infrastructure. The compromise of an AI gateway is a particularly nasty primitive: the gateway sees every prompt, every output, and every API key the organization routes through it.
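The report does not include SANDCLOCK’s collection logic, but its target list can be inverted into a cheap self-audit: the same credential classes it harvests should never be sitting in plaintext in a repo or a gateway config directory. A rough sketch of such a sweep follows; the regexes use the commonly published key prefixes and are a starting point, not an authoritative detection set.

```python
import re
import sys
from pathlib import Path

# Credential classes the GTIG report says SANDCLOCK targeted: AWS keys,
# GitHub tokens, and AI API secrets. Prefix patterns only; tune for your estate.
SECRET_PATTERNS = {
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\b(?:ghp|gho|ghs)_[A-Za-z0-9]{36,}\b"),
    "google_api_key": re.compile(r"\bAIza[0-9A-Za-z_\-]{35}\b"),
    "anthropic_key": re.compile(r"\bsk-ant-[A-Za-z0-9_\-]{20,}\b"),
    "openai_style_key": re.compile(r"\bsk-[A-Za-z0-9_\-]{20,}\b"),
}

def sweep(root: str) -> None:
    """Walk a directory tree and report files containing likely secrets."""
    for path in Path(root).rglob("*"):
        if not path.is_file() or path.stat().st_size > 1_000_000:
            continue  # skip directories and large binaries
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(text):
                print(f"[!] possible {name} in {path}")

if __name__ == "__main__":
    sweep(sys.argv[1] if len(sys.argv) > 1 else ".")
```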
The strongest proprietary inference from stacking these data points is structural. GTIG’s report is not the story of one criminal group breaking through; it is the story of three distinct adversary ecosystems — Russia-nexus cybercrime, China-nexus espionage, opportunistic supply-chain crews — each finding independent product-market fit with the same family of LLM tooling within an overlapping six-month window. That kind of convergence is not a research curiosity. It is the empirical signal that the cost-of-entry barrier for AI-assisted offensive operations is no longer high enough that only a handful of actors can clear it. PYMNTS’ breakdown of the GTIG findings put a clean number on the velocity: GTIG saw a roughly fourfold increase in distinct AI-augmented malware families in the quarter between its February and May 2026 reports. At that compounding rate, the threat surface in November is unrecognizable from the threat surface in February.
What might not be as scary as it sounds
The bull case for defenders is that this exploit was caught — and caught early enough that it never produced a single confirmed victim. That is not a small detail. The GTIG team identified the operation through telemetry, coordinated responsible disclosure with the affected vendor, and pulled the operator’s Gemini account before the mass-exploitation phase. The defender stack that produced this win combined classical threat intelligence with two new AI agents that have been quietly compounding throughout the past year. Google DeepMind’s CodeMender announcement from October 2025 described an autonomous code-security agent that, in the seven months between its launch and the May 2026 report, upstreamed 72 security fixes to open source projects, including projects as large as four million lines of code. Big Sleep, the sister agent focused on zero-day discovery, was credited with finding a critical SQLite vulnerability before threat actors could weaponize it. The offensive side got faster. The defensive side is also accelerating, and Google is one of a handful of defenders with the model-API telemetry to spot a malicious user mid-operation.
The second counterpoint is that “first” claims about AI-built exploits deserve skepticism, because the same exploit could have been generated by a sufficiently determined human researcher. The bug class — a 2FA bypass arising from a hard-coded trust assumption — is the kind of finding a careful manual code review catches every week, and there is no public test that proves GTIG’s specific exploit could not have been written by a human typing slowly. What GTIG can say with high confidence is that the exploit script carried LLM fingerprints; it cannot rule out that the operator used an LLM to format and polish a vulnerability they had identified themselves. BleepingComputer’s report on the same incident carried that note of caution and quoted security researchers warning against treating the “first AI zero-day” framing as a binary line. The honest reading is that the boundary between human-augmented and AI-augmented exploitation is not crisp, and the GTIG telemetry simply caught one operator using enough AI in enough places to be detectable. Detection bias may make the trend look sharper than it is.
The third counterpoint is the most important to address directly. The GTIG report did not show an LLM autonomously discovering a vulnerability, deciding to exploit it, and operating end-to-end without human direction. There was a human in the loop — a criminal operator iterating prompts, evaluating outputs, and shipping the working artifact. The capability gap between “an LLM can help a skilled operator move faster” and “an LLM can independently run an exploitation campaign” is still meaningful, and the autonomous-malware demonstrations from PROMPTSPY are scoped to post-compromise execution rather than initial-access decision-making. The NBC News writeup carried Google’s own qualification: today’s offensive AI is a productivity amplifier, not an autonomous adversary. That distinction matters operationally, because the existing playbook for hunting human-led intrusions still applies — it just needs to assume the human is now ten times more productive than the same human in 2024.
The bear rebuttal is that these counterpoints describe the situation as of May 2026, and the relevant question for a CISO planning a 2027 program is the trajectory rather than the snapshot. The bug class will widen. The fingerprints will get scrubbed. The autonomous loops will close. The defender AI agents will continue to improve, but the offensive side has structural advantages — model fine-tuning, prompt engineering, and operator iteration are all easier than building enterprise-grade detection pipelines — and the cost asymmetry will widen for at least another twelve to eighteen months. The AI security budget data captured in Palo Alto Networks’ 2026 cybersecurity predictions put a number on the gap: thirty percent of enterprises now report a dedicated AI security budget, up from twenty percent a year earlier, while only six percent of organizations describe their AI security strategy as “advanced.” The bear case is that the speed of attacker AI adoption is outrunning the speed of defender AI maturity, and the GTIG report is the first quantitative evidence that the gap is producing real-world incidents rather than theoretical risk reports.
How to prepare before the next exploitation wave hits
The most likely scenario over the next six months is a normalization of AI-built exploitation operations across the mid-tier criminal landscape. The GTIG report described an attempted mass-exploitation event that was caught at the operator-account level by a model provider with deep telemetry into its own platform. Replication is straightforward, and several mitigations that would have made replication harder — provider-side abuse detection, account-level rate limits on adversarial query patterns, mandatory enterprise-tier verification for high-volume API access — are not yet universal across model platforms. The CAISI pre-deployment framework for frontier models, which I covered in the May 7 piece on the Center for AI Standards and Innovation’s expanded testing program, is part of the response, but pre-deployment testing operates on the model rather than on the operator. Account-level abuse monitoring is the gap, and the GTIG report is implicit pressure on every major model provider to publish their own version of the data Google just published.
The regulatory consequences will start arriving on a quarter-by-quarter timeline. The Pennsylvania attorney general’s case against Character.AI, which I unpacked in the May 9 post on AI-as-medical-practice liability, established that state-level enforcement will treat AI products as in scope for traditional regulatory frameworks. Cybersecurity regulators have a parallel doctrine — under existing software-liability law, a vendor that ships a known-vulnerable product or fails to respond to coordinated disclosure faces enforcement risk, and there is no legal exemption when the vulnerability discovery was AI-assisted. The Stanford AI Index 2026 governance commentary I reviewed in late April flagged that the gap between AI capability growth and governance maturity is widest in the cybersecurity domain. The GTIG findings will accelerate the regulatory cycle, and CISOs should expect new disclosure obligations around AI-assisted incidents, AI-related supply chain compromises, and adversary use of enterprise-purchased models.
The second-order effect to plan for is the emergence of the AI gateway as a security choke point. The LiteLLM compromise inside the GTIG report is not a one-off — it is a structural warning that any organization that has consolidated its model access through a single gateway has consolidated its risk through that gateway. The same architecture that gives a security team prompt logging, cost control, and policy enforcement also gives an attacker, if the gateway is compromised, a perfect vantage point for credential harvesting and prompt-injection persistence. Enterprises that have moved aggressively into AI gateways through 2025 — and the McKinsey research on agentic enterprise security architectures suggests most Fortune 500 organizations have — need to apply the same hardening posture to the gateway that they apply to identity providers. The gateway is now an identity-adjacent control surface, not a developer convenience.
A third dynamic to watch is the deflation in the price of zero-days on private markets. Until the GTIG report, broker-mediated zero-days for widely deployed open-source software traded in the high six-figure to low seven-figure range, depending on category and target. The GTIG case shows that an LLM plus an operator can produce a comparable artifact at near-zero marginal cost, and the broker market will adjust within months. The implication for defenders is a higher background rate of unsophisticated but functional zero-day deployment by lower-tier actors who could not previously afford the brokered route. Threat models that assumed sophisticated actors as the only sources of unknown-vulnerability risk will need rewriting in light of that flattened cost curve.
Operator checklist for CISOs and security architects preparing for the post-GTIG threat landscape:
- Assume every open-source administrative tool in your perimeter is a candidate target for AI-assisted vulnerability hunting; prioritize coordinated disclosure relationships and emergency-patch pipelines with the vendors of your top 20 such tools.
- Audit your AI gateway as an identity-adjacent control surface; apply the same hardening, monitoring, and incident-response coverage you apply to your SSO provider, not the lighter posture typical of developer tooling.
- Treat PyPI, npm, and similar package compromises as your highest-velocity supply chain risk for AI development, given the LiteLLM-class compromises documented in the GTIG report; deploy package integrity verification across CI/CD pipelines (a hash-pinning sketch follows this list) and require maintainer-level provenance for AI-adjacent dependencies.
- Map your detection stack to runtime-mutable malware behaviors rather than static indicators of compromise; expect samples that regenerate their behavior between deployments and design detection rules that survive that mutation.
- Negotiate enterprise-tier abuse monitoring agreements with your model providers; demand the same telemetry quality Google demonstrated in the GTIG report, including operator-level abuse detection and notification.
- Set explicit thresholds for when an AI-assisted exploitation attempt rises to a board-level disclosure event; coordinate with general counsel on disclosure obligations under SEC cyber incident reporting rules and state-level requirements.
- Build internal red-team capability that mirrors the offensive AI playbook GTIG documented; if your red team cannot produce an LLM-assisted exploit chain in a controlled environment, your blue team has no reference for what to detect.
- Track GTIG’s future quarterly reports as a leading indicator; the data velocity between the February 2026 and May 2026 reports is the relevant base rate for planning the next six months, not the snapshot from any single quarter.
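On the package-integrity item above: one low-effort pattern is to pin every dependency with a sha256 hash in requirements.txt and install with pip’s --require-hashes flag, which makes pip reject any artifact that does not match its pin. The CI gate below sketches the enforcement half; the file layout and invocation are assumptions about a generic pipeline, not anything prescribed by the GTIG report.

```python
"""Fail CI when a pinned dependency carries no --hash entry.

Pairs with `pip install --require-hashes -r requirements.txt`, which refuses
any package whose downloaded artifact does not match a pinned sha256 -- the
cheap defense against LiteLLM-style upstream tampering.
"""
import sys
from pathlib import Path

def unhashed_requirements(req_file: str) -> list[str]:
    """Return requirement entries that lack a --hash pin."""
    text = Path(req_file).read_text()
    # Hash pins usually sit on backslash-continued lines; join them first.
    logical_lines = text.replace("\\\n", " ").splitlines()
    offenders = []
    for raw in logical_lines:
        line = raw.strip()
        if not line or line.startswith(("#", "-")):
            continue  # blanks, comments, and pip options like --index-url
        if "--hash=" not in line:
            offenders.append(line.split()[0])
    return offenders

if __name__ == "__main__":
    req = sys.argv[1] if len(sys.argv) > 1 else "requirements.txt"
    missing = unhashed_requirements(req)
    if missing:
        print("Dependencies without hash pins:")
        for name in missing:
            print(f"  {name}")
        sys.exit(1)
    print("All requirements carry hash pins.")
```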
In other news
Google launches Gemini Intelligence to reshape Android against Apple. Google used its May 12 Android Show: I/O Edition to introduce “Gemini Intelligence,” positioning Android as an “intelligence system” rather than an operating system, with cross-app automation rolling out first to Samsung Galaxy and Pixel devices this summer and expanding to wearables, autos, and laptops later in 2026 — just weeks before Apple’s expected Gemini-powered Siri reboot at WWDC (CNBC).
OpenAI launches a $4 billion Deployment Company and acquires Tomoro. OpenAI on May 11 unveiled the OpenAI Deployment Company, a $4 billion enterprise services venture co-led by TPG with Advent, Bain Capital, and Brookfield, and announced the acquisition of Scottish applied-AI firm Tomoro to seed the new entity with approximately 150 Forward Deployed Engineers. The move directly targets McKinsey, Accenture, and Anthropic’s nascent professional-services revenue (The Next Web).
Anthropic expands a multibillion-dollar compute deal with Google and Broadcom. Anthropic announced a renewed and expanded partnership with Google Cloud and Broadcom for custom TPU-based training infrastructure, deepening its dependence on Google compute as inference demand for Claude continues to accelerate. The deal is the latest in a sequence of Anthropic compute commitments that have totaled tens of billions over the past year (Anthropic).
Google’s Gemini powers the Siri redesign at Apple. Apple confirmed on May 6 that the redesigned Siri shipping with iOS 27 will be powered by Google’s Gemini model under a multi-year licensing deal, an admission that Apple is leaning on a direct rival to close the personal-AI gap and a notable inversion of the historical Apple Maps-style insourcing pattern (9to5Mac).
Pentagon awards AI ceiling contracts to eight Big Tech vendors, snubs Anthropic. The Department of Defense awarded major AI capability contracts to eight large technology providers including Microsoft, Google, and OpenAI on May 1, while pointedly excluding Anthropic from the top tier despite an Anthropic-led bid being viewed as a frontrunner earlier in the year. The contracts cement the federal government as the largest single AI buyer of 2026 (CNN).