AI Cybersecurity Capabilities Surge Past Expectations

Two leading artificial intelligence models, Anthropic’s Claude Mythos Preview and OpenAI’s GPT-5.5, have outpaced the already rapid trend of improvement in autonomous cybersecurity tasks, according to independent evaluations published on Wednesday, July 16, 2025.

The findings, released by the UK’s AI Security Institute (AISI) and Palo Alto Networks, indicate that both models have exceeded the doubling trend AISI had been tracking since late 2024. Whether this represents a temporary spike or the beginning of a steeper trajectory remains uncertain.

AISI Reports Accelerated AI Autonomy in Cyber Tasks

The AISI, which evaluates frontier AI models for the British government, had most recently estimated that the length of cybersecurity tasks AI can complete autonomously was doubling every five months, well ahead of the eight-month doubling time recorded in November 2025. The new results suggest an even more rapid progression.

“Frontier AI’s autonomous cyber and software capability is advancing quickly: the length of cyber tasks that frontier models can complete autonomously has doubled on the order of months, not years.”

The AI Security Institute
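
To make the gap between the two doubling times concrete, the arithmetic can be sketched in a few lines of Python. This is an illustrative calculation, not AISI’s methodology: it assumes clean exponential growth, and the function name `growth_factor` and the 12-month horizon are choices made here for the example.

```python
def growth_factor(months: float, doubling_time_months: float) -> float:
    """Multiple by which task length grows over `months`, assuming
    steady exponential growth with the given doubling time."""
    return 2 ** (months / doubling_time_months)


# Doubling times cited in the article, in months.
# The 12-month horizon is an assumption made for illustration.
for doubling_time in (8, 5):
    factor = growth_factor(12, doubling_time)
    print(f"{doubling_time}-month doubling: about {factor:.1f}x over 12 months")
```

Under these assumptions, an eight-month doubling time implies roughly a 2.8-fold increase in task length over a year, while a five-month doubling time implies roughly a 5.3-fold increase.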

Breakthrough Performance in Simulated Cyber Attacks

The most compelling evidence came from AISI’s cyber ranges, which simulate multi-stage attacks on undefended enterprise networks. In these tests:

  • Claude Mythos Preview became the first model to complete both of AISI’s cyber ranges:
    • Solved “The Last Ones”, a 32-step simulated corporate network attack, in 6 out of 10 attempts.
    • Completed “Cooling Tower”—a previously unsolved challenge—for the first time, succeeding in 3 out of 10 attempts.
  • GPT-5.5 solved “The Last Ones” in 3 out of 10 attempts.
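
Because each range was attempted only ten times, the success rates above carry wide uncertainty. The sketch below puts approximate 95% Wilson score intervals around the reported counts. The choice of the Wilson interval and the helper name `wilson_interval` are assumptions made here; neither AISI nor the article reports intervals of this kind.

```python
import math


def wilson_interval(successes: int, attempts: int, z: float = 1.96):
    """Wilson score interval for a success rate (z=1.96 is roughly 95%)."""
    p = successes / attempts
    denom = 1 + z**2 / attempts
    center = (p + z**2 / (2 * attempts)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / attempts + z**2 / (4 * attempts**2)) / denom
    )
    return center - half_width, center + half_width


# Counts as reported in the article: (label, successes, attempts)
results = [
    ("Claude Mythos Preview, The Last Ones", 6, 10),
    ("Claude Mythos Preview, Cooling Tower", 3, 10),
    ("GPT-5.5, The Last Ones", 3, 10),
]
for label, k, n in results:
    low, high = wilson_interval(k, n)
    print(f"{label}: {k}/{n}, roughly {low:.0%} to {high:.0%}")
```

With so few attempts, the intervals for 6 in 10 and 3 in 10 overlap substantially, so the difference between the two models on “The Last Ones” is better read as suggestive than as a precise measurement.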

Palo Alto Networks Confirms AI’s Growing Threat

Palo Alto Networks, which began testing Claude Mythos in April 2025 as part of Anthropic’s Project Glasswing, corroborated AISI’s findings. The company also tested Anthropic’s Claude Opus 4.7, as well as OpenAI’s GPT-5.5-Cyber through OpenAI’s Trusted Access for Cyber program.

“The latest models are extraordinarily capable at finding vulnerabilities and converting them into critical exploit paths in near-real-time.”

Palo Alto Networks

Using AI-driven scanning across more than 130 products, Palo Alto Networks identified 26 CVEs, covering 75 vulnerabilities, in a single evaluation cycle. That far exceeds the typical rate of fewer than five CVEs discovered per month. All critical vulnerabilities in its SaaS products have been patched, and updates are available for customer-operated systems.

Researchers Acknowledge Study Limitations

The AISI noted that its estimates are based on a small sample of models and limited human comparison data for the most complex tasks. However, the institute emphasized that the overall trend remains robust: excluding any single model shifts the estimated doubling time by less than a month.
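
The robustness check described above is a leave-one-out test, and its general shape can be sketched as follows. This is not AISI’s code or data: the data points are synthetic placeholders invented for illustration, and the function names `doubling_time` and `leave_one_out_shifts` are hypothetical. The sketch fits a straight line to log2(task length) over time, then refits with each model removed.

```python
import math


def doubling_time(points):
    """Least-squares fit of log2(task length) against time.
    Returns the number of months per doubling."""
    xs = [month for month, _ in points]
    ys = [math.log2(length) for _, length in points]
    n = len(points)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # doublings per month
    return 1 / slope


def leave_one_out_shifts(points):
    """Change in estimated doubling time when each point is dropped in turn."""
    full = doubling_time(points)
    return [
        doubling_time(points[:i] + points[i + 1:]) - full
        for i in range(len(points))
    ]


# SYNTHETIC placeholder data, not AISI measurements:
# (months since first model, task length in minutes)
synthetic = [(0, 10), (5, 21), (10, 38), (15, 85), (20, 150)]

print(f"Full fit: {doubling_time(synthetic):.2f} months per doubling")
for i, shift in enumerate(leave_one_out_shifts(synthetic)):
    print(f"Dropping point {i}: shift of {shift:+.2f} months")
```

If every shift stays small, as AISI reports for its own data, no single model is driving the estimated trend.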

Broader Implications for AI and Cybersecurity

Separate research from METR, a nonprofit that tracks how well AI handles software tasks, suggests that the rapid gains may extend beyond cybersecurity. The findings underscore the need for updated frameworks to assess and mitigate the risks posed by increasingly autonomous AI systems.

Source: CyberScoop