Two weeks ago, Anthropic announced that its new model, Claude Mythos Preview, can autonomously find and weaponize software vulnerabilities, turning them into working exploits without expert guidance. These vulnerabilities were present in critical software such as operating systems and internet infrastructure—flaws that thousands of developers had previously overlooked.
This capability poses significant security risks to the devices and services we rely on daily. As a result, Anthropic is not releasing the model to the general public. Instead, access is limited to a small number of companies.
The Cybersecurity Community Reacts
The announcement sent shockwaves through the internet security community. However, Anthropic provided few details, leaving many observers frustrated and skeptical. Some speculate that Anthropic lacks the necessary GPUs to run the model at scale, using cybersecurity concerns as a justification to limit its release. Others argue that the company is adhering to its AI safety mission. Amidst the hype and counter-hype, reality and marketing blur together, making it difficult even for experts to separate fact from fiction.
We view Mythos as a real but incremental advance, one step in a long progression of improvements. Yet even small steps can carry significant weight when viewed in the broader context.
How AI Is Reshaping Cybersecurity
We’ve previously discussed Shifting Baseline Syndrome, a phenomenon where people—both the public and experts—dismiss long-term, transformative changes that occur gradually. This has already happened with online privacy, and it’s now unfolding with AI. While vulnerabilities discovered by Mythos could theoretically have been found using AI models from last month or last year, they would have been impossible to detect with models from just five years ago.
The Mythos announcement underscores how far AI has advanced in a short time: the baseline has truly shifted. Today’s large language models excel at tasks like identifying vulnerabilities in source code. Whether it arrived last year or arrives next year, this capability was bound to emerge; the real challenge lies in how we adapt to it.
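To make that claim concrete, here is a minimal, hypothetical Python snippet showing the kind of textbook flaw, a SQL injection, that current models flag reliably during code review. The function names and table schema are invented for illustration.

```python
import sqlite3

def find_user_unsafe(conn: sqlite3.Connection, username: str):
    # Vulnerable: user input is interpolated directly into the SQL text,
    # so a value like "x' OR '1'='1" changes the meaning of the query.
    query = f"SELECT id, email FROM users WHERE name = '{username}'"
    return conn.execute(query).fetchall()

def find_user_safe(conn: sqlite3.Connection, username: str):
    # Fixed: a parameterized query keeps the input as data, not SQL.
    return conn.execute(
        "SELECT id, email FROM users WHERE name = ?", (username,)
    ).fetchall()
```

Spotting the unsafe interpolation and proposing the parameterized version is the kind of pattern-level reasoning today’s models handle routinely; the harder cases discussed below involve verification and patching, not recognition.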
Offense vs. Defense: A Nuanced Battlefield
We do not believe that an AI capable of autonomous hacking will create a permanent imbalance between offensive and defensive cybersecurity capabilities. The reality is more complex:
- Some vulnerabilities can be found, verified, and patched automatically.
- Others will be hard to find but easy to verify and patch, for example flaws in generic cloud-hosted web applications built on standard software stacks, where updates can be deployed rapidly.
- Some vulnerabilities will be easy to find and verify but difficult or impossible to patch, particularly in IoT appliances and industrial equipment that are rarely updated or cannot be easily modified.
- Finally, there are systems where vulnerabilities are easy to find in code but difficult to verify in practice. For example, complex distributed systems and cloud platforms may consist of thousands of interacting services running in parallel, making it challenging to distinguish real vulnerabilities from false positives or to reliably reproduce them.
To address these challenges, we must categorize vulnerabilities based on how easily they can be found, verified, and patched. This taxonomy provides a framework for understanding the evolving cybersecurity landscape.
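As a rough illustration, the sketch below (hypothetical Python; the class, field, and example names are ours, not part of any existing tool) encodes the four situations from the list above as profiles along those three axes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Difficulty(Enum):
    EASY = "easy"
    HARD = "hard"

@dataclass(frozen=True)
class VulnProfile:
    find: Difficulty              # how hard the flaw is to discover
    verify: Difficulty            # how hard it is to confirm exploitability in practice
    patch: Optional[Difficulty]   # how hard a fix is to deploy (None = unclear from context)

# The four situations sketched above, expressed as illustrative profiles.
EXAMPLES = {
    "fully automatable case":          VulnProfile(Difficulty.EASY, Difficulty.EASY, Difficulty.EASY),
    "cloud web app, standard stack":   VulnProfile(Difficulty.HARD, Difficulty.EASY, Difficulty.EASY),
    "rarely updated IoT / industrial": VulnProfile(Difficulty.EASY, Difficulty.EASY, Difficulty.HARD),
    "large distributed system":        VulnProfile(Difficulty.EASY, Difficulty.HARD, None),
}
```

The point of the encoding is that an AI that shifts only the "find" axis leaves the hardest part of the defensive problem, verification and patching at scale, largely untouched.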