Google's Threat Intelligence Group (GTIG) has published what it describes as the first documented case of a threat actor deploying a zero-day exploit that was likely developed with the assistance of an artificial intelligence model. The company says its early detection may have prevented a planned mass exploitation campaign from ever launching.
The vulnerability, detailed in a report published Monday, targeted a popular open-source web-based system administration tool. According to Google, the flaw would have allowed attackers to bypass two-factor authentication checks after obtaining valid credentials. The exploit was implemented as a Python script.
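The script itself has not been released, so the details remain opaque. As a rough illustration of the bug class only, with every endpoint and field name invented, an exploit of this shape can be remarkably short: a session that completes the first login step and then simply never visits the second.

```python
# Illustrative sketch of the bug class only; not the actual exploit.
# The target URL, endpoints, and field names are all invented.
import requests

TARGET = "https://admin-tool.example.com"  # hypothetical target

def bypass_2fa(username: str, password: str) -> requests.Session:
    session = requests.Session()

    # Step 1: the first factor. Stolen but valid credentials pass normally.
    resp = session.post(
        f"{TARGET}/login",
        data={"user": username, "pass": password},
    )
    resp.raise_for_status()

    # Step 2 is supposed to be a TOTP prompt at /login/2fa. The flaw in
    # this sketch: the server marks the session authenticated after step 1,
    # and protected routes never verify that the second step actually ran,
    # so the script skips straight ahead.
    resp = session.get(f"{TARGET}/admin/users")
    resp.raise_for_status()
    return session
```

The point is the shape, not the specifics: the attack abuses an order of operations, not a memory bug.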
Google declined to name the targeted software vendor, the criminal group involved, or the specific AI model used. The company stated it has "high confidence" an AI model assisted in discovering and weaponizing the vulnerability, though it does not believe its own Gemini or Anthropic's Claude were involved.
Counter-Discovery Disrupts the Campaign
The company worked with the unnamed vendor to patch the issue before the attack could proceed. Google also notified law enforcement. The vulnerability stemmed from what the report describes as a hard-coded trust exception in the authentication flow, creating a logic error that traditional fuzzing tools would have missed.
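The report does not quote the vulnerable code. Purely as an invented sketch of what a hard-coded trust exception in an authentication flow can look like, with every name hypothetical:

```python
# Invented sketch of a hard-coded trust exception; not the vendor's code.
# check_password and check_totp stand in for the real credential checks.

def check_password(user: str, password: str) -> bool:
    ...  # compare against the stored password hash

def check_totp(user: str, code: str) -> bool:
    ...  # validate the time-based one-time password

def verify_login(user: str, password: str,
                 totp_code: str, client_id: str) -> bool:
    if not check_password(user, password):
        return False
    # The logic error: one client identifier, presumably added for an
    # internal tool or left over from debugging, is trusted to skip the
    # second factor entirely.
    if client_id == "internal-health-check":  # hypothetical hard-coded value
        return True
    return check_totp(user, totp_code)
```

The shape of the mistake is a single branch that trusts a constant instead of a credential.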
John Hultquist, chief analyst at Google Threat Intelligence Group, characterized the discovery bluntly in interviews. "The era of AI-driven vulnerability and exploitation is already here," he said. "For every zero-day we can trace back to AI, there are probably many more out there."
The distinction between what frontier models find and what conventional tooling finds matters. Large language models have shown aptitude for identifying the kind of high-level logic flaws that automated scanning tools struggle with. While fuzzers excel at finding crashes and memory corruption, frontier models can reason about authentication flows, trust assumptions, and design-level mistakes. That capability is now showing up in criminal operations.
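To make the blind spot concrete, here is a minimal harness using Atheris, Google's coverage-guided fuzzer for Python, pointed at a check with the same invented flaw as the sketch above. It will run indefinitely without reporting anything, because no input makes the function misbehave in a way a fuzzer can observe.

```python
# Sketch: why a crash-oriented fuzzer is blind to a pure logic flaw.
# Requires Google's Atheris fuzzer: pip install atheris
import sys
import atheris

@atheris.instrument_func  # coverage instrumentation for the target
def is_authorized(source: str, password_ok: bool, totp_ok: bool) -> bool:
    # Hypothetical hard-coded trust exception, as in the sketch above.
    if source == "internal-health-check":
        return True
    return password_ok and totp_ok

def TestOneInput(data: bytes):
    fdp = atheris.FuzzedDataProvider(data)
    # Every possible input yields a clean True or False: no exception,
    # no crash, no memory corruption. The fuzzer can even stumble onto
    # the bypass branch and still report nothing, because nothing breaks.
    is_authorized(
        fdp.ConsumeUnicodeNoSurrogates(24),
        fdp.ConsumeBool(),
        fdp.ConsumeBool(),
    )

if __name__ == "__main__":
    atheris.Setup(sys.argv, TestOneInput)
    atheris.Fuzz()
```

A human reviewer, or a model, reading the function sees the bypass branch immediately; a harness that only distinguishes crash from no-crash never will.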
State Actors Are Already Experimenting
The report also documents broader trends in AI-assisted offensive security. North Korea's APT45 has reportedly sent thousands of repetitive prompts to recursively analyze CVEs and validate proof-of-concept exploits. A Chinese actor known as UNC2814 used what Google calls a "persona-driven jailbreak" to research vulnerabilities in TP-Link firmware. The company observed threat actors experimenting with a GitHub repository containing over 5,000 real-world vulnerability cases from China's WooYun disclosure platform.
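Google does not describe the actors' tooling beyond the prompt volume, but the workflow it reports, repeated rounds of analysis followed by validation against a model, reduces to a simple loop. The sketch below is entirely hypothetical; query_model is a stand-in for whichever model API was used.

```python
# Hypothetical reconstruction of the workflow the report describes;
# not actual threat-actor tooling.

def query_model(prompt: str) -> str:
    raise NotImplementedError("stand-in for a model API call")

def analyze_and_validate(cve_id: str, poc_source: str, rounds: int = 3) -> str:
    # "Recursive" analysis: each round feeds the previous answer back in,
    # which is what thousands of near-identical prompts look like at scale.
    analysis = query_model(f"Explain the root cause of {cve_id}.")
    for _ in range(rounds):
        analysis = query_model(
            f"Refine this analysis of {cve_id}, correcting any errors:\n{analysis}"
        )
    # Validation step: ask the model whether a given proof of concept
    # actually exercises the flaw the analysis describes.
    return query_model(
        f"Analysis of {cve_id}:\n{analysis}\n\n"
        f"Does the following code exercise that flaw? Answer and explain.\n{poc_source}"
    )
```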
Google, for its part, is deploying AI defensively as well. Its Big Sleep agent, developed by Google DeepMind and Google Project Zero, actively searches for unknown vulnerabilities in software. The company says Big Sleep helped find the vulnerability at the center of the campaign GTIG disrupted. Google has also introduced CodeMender, an experimental agent designed to automatically patch critical code flaws using Gemini's reasoning capabilities.
The broader context is a security landscape where model access increasingly determines offensive capability. Criminal hackers, Hultquist noted, have more to gain from AI's speed than slow-moving state espionage operations do. Ransomware timelines are compressing. The window between vulnerability discovery and exploitation has already largely vanished.
Google's report notes that mistakes in the exploit's implementation may have interfered with the criminals' plans in this case. That clumsiness probably won't last. The AI security arms race has entered a phase where both attackers and defenders are deploying increasingly capable models. The difference is which side moves faster.
For defenders, the message is straightforward: the traditional disclosure-to-patch window is no longer a planning assumption. It is a race condition, and AI is now running on both sides of it.