Transforming a newly discovered software vulnerability into a cyberattack used to take months. Today, as headlines about Anthropic’s Project Glasswing have shown, generative AI can do the job in minutes, often for less than a dollar of cloud-computing time.
While large language models (LLMs) present a real cyberthreat, they also provide an opportunity to reinforce cyberdefenses. Anthropic reports that its Claude Mythos preview model has already helped defenders preemptively discover over a thousand zero-day vulnerabilities, including flaws in every major operating system and web browser, and the company coordinates disclosure and patching of the flaws the model reveals.
It is not yet clear whether AI-driven bug finding will ultimately favor attackers or defenders. To understand how defenders can increase their odds—and perhaps hold the advantage—it helps to look at an earlier wave of automated vulnerability discovery.
Lessons from the Fuzzing Revolution
Fuzz testing dates back decades, but in the early 2010s a new generation of tools emerged that could attack programs with millions of random, malformed inputs: a proverbial "monkey at a typewriter," tapping on keys until it found a vulnerability. When coverage-guided "fuzzers" like American Fuzzy Lop (AFL) hit the scene, they exposed critical flaws in every major browser and operating system.
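AFL and its successors guide those mutations with code-coverage feedback; the sketch below strips that away and shows only the core loop. It is a minimal illustration in Python, with a hypothetical buggy parser, check_header, standing in for the program under test.

```python
import random

def check_header(data: bytes) -> None:
    """Hypothetical stand-in for a real input parser under test."""
    if len(data) > 3 and data[:2] == b"PK":
        length = data[2]  # contrived bug: a length field trusted blindly
        _ = data[4:4 + length][length - 1]  # IndexError when the body is short

def mutate(seed: bytes) -> bytes:
    """Randomly flip, insert, or delete bytes in a seed input."""
    data = bytearray(seed)
    for _ in range(random.randint(1, 8)):
        choice = random.random()
        pos = random.randrange(max(len(data), 1))
        if choice < 0.5 and data:
            data[pos] ^= 1 << random.randrange(8)    # flip one bit
        elif choice < 0.8:
            data.insert(pos, random.randrange(256))  # insert a random byte
        elif data:
            del data[pos]                            # delete a byte
    return bytes(data)

seed = b"PK\x03\x04 hello"
for i in range(1_000_000):
    candidate = mutate(seed)
    try:
        check_header(candidate)
    except Exception as exc:  # a crash means the fuzzer found a bug
        print(f"iteration {i}: crashing input {candidate!r} ({exc})")
        break
```

Even this naive version finds the planted bug within a few iterations; coverage feedback is what lets real fuzzers do the same against far deeper code paths.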
The security community’s response was instructive. Rather than panic, organizations industrialized the defense. For instance, Google built a system called OSS-Fuzz that runs fuzzers around the clock on thousands of open source projects. This allowed software providers to catch bugs before they shipped, not after attackers found them.
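Here is a sketch of what such a harness can look like in Python, using Atheris, Google’s coverage-guided Python fuzzing engine that OSS-Fuzz supports; mypackage and parse_config are hypothetical stand-ins for the project under test.

```python
import sys

import atheris

# Instrumenting the import lets Atheris collect coverage feedback from the
# library's code, the way OSS-Fuzz harnesses do.
with atheris.instrument_imports():
    from mypackage import parse_config  # hypothetical parser under test

def TestOneInput(data: bytes) -> None:
    fdp = atheris.FuzzedDataProvider(data)
    text = fdp.ConsumeUnicodeNoSurrogates(4096)
    try:
        parse_config(text)
    except ValueError:
        pass  # documented failure mode; any other exception is a bug

if __name__ == "__main__":
    atheris.Setup(sys.argv, TestOneInput)
    atheris.Fuzz()  # runs until it crashes or is stopped
```

OSS-Fuzz builds harnesses like this inside containers, runs them continuously on Google’s infrastructure, and automatically files bugs against the affected projects.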
The expectation is that AI-driven vulnerability discovery will follow the same arc: organizations will integrate the tools into standard development practice, run them continuously, and establish a new baseline for security.
The Human Cost of AI-Powered Vulnerabilities
But the analogy has a limit. Fuzzing required significant technical expertise to set up and operate—it was a tool for specialists. An LLM, meanwhile, finds vulnerabilities with just a prompt, resulting in a troubling asymmetry: attackers no longer need to be technically sophisticated to exploit code, while robust defenses still require engineers to read, evaluate, and act on what the AI models surface.
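To see how low that bar is, consider a minimal sketch using Anthropic’s Python SDK. The model name is a placeholder, and the snippet under review is a contrived example with a textbook SQL-injection flaw.

```python
import anthropic  # pip install anthropic; reads ANTHROPIC_API_KEY from the env

# Contrived code under review: SQL built by string formatting.
SNIPPET = '''
def get_user(db, username):
    return db.execute(f"SELECT * FROM users WHERE name = '{username}'")
'''

client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-5",  # placeholder; use a model you have access to
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": "List any security vulnerabilities in this code, "
                   "with severity and a suggested fix:\n" + SNIPPET,
    }],
)
print(response.content[0].text)
```

Nothing about this workflow requires security expertise, and the same one-prompt loop serves an attacker hunting for exploitable flaws just as well as a defender.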
The human cost of finding and exploiting bugs may approach zero, but fixing them won’t.
Can AI Fix What It Finds?
In the opening to his book Engineering Security, Peter Gutmann observed that "a great many of today’s security technologies are ‘secure’ only because no-one has ever bothered to look at them." He wrote that before AI made looking dramatically cheaper.
Much of today’s code, including the open source infrastructure that commercial software depends on, is maintained by small teams, part-time contributors, or individual volunteers with no dedicated security resources. A bug in a single widely used open source project can have significant downstream impact.
The Log4j Case: A Warning for the AI Era
In 2021, a critical vulnerability in Log4j (known as Log4Shell), a logging library maintained by a handful of volunteers, exposed hundreds of millions of devices. Because the library was embedded in countless products, a flaw in a single volunteer-maintained component became one of the most widespread software vulnerabilities ever recorded.
The popular code library is just one example of the broader problem of critical software dependencies that have never been seriously audited. For better or worse, AI is accelerating both the discovery and the exploitation of such vulnerabilities. The question now is whether defenders can build automated defenses that match the speed and scale of AI-driven attacks.