Federal Chief Information Officer Greg Barbaccia said on Tuesday that the government is approaching Anthropic’s Mythos model with cautious optimism. While acknowledging its potential to strengthen federal cyber defenses, he emphasized the substantial uncertainties surrounding its real-world performance.
Barbaccia noted that his direct exposure to Mythos has been limited to evaluations and benchmarking tests. No federal agencies have deployed the model yet.
While the Office of the National Cyber Director is coordinating the government’s approach, Barbaccia offered a direct assessment of AI-assisted cybersecurity’s future:
“We’re going to get to a world soon where AI defense will be able to catch up. We must get to a point where the bots are finding the bots.”
Earlier this month, Barbaccia sent an email to cabinet agencies informing them that the Office of Management and Budget has begun laying the groundwork for a controlled rollout of Mythos to federal agencies.
Barbaccia's comments reflect the view that the same capabilities that make Mythos a potential offensive threat are what make it valuable as a defensive tool. Anthropic has stated that during testing, the model identified thousands of previously unknown, high-severity vulnerabilities across major operating systems and web browsers, many of them decades old.
Real-World Performance Remains Uncertain
The critical question for federal security teams is not whether Mythos’s capabilities are real, but whether they can translate from controlled laboratory settings to the complex, defended networks that government agencies actually operate. Barbaccia was candid about this gap:
“I think it’ll uplevel people and make a novice cybersecurity offensive operator more efficient. But the jury is still out on how effective it’ll be against real-world conditions, meaning a network that’s guarded by human defenders that has alerting and things like that. The evaluations I’ve seen have been laboratory learnings.”
This distinction matters for any agency evaluating the model: finding a vulnerability and successfully exploiting it in a defended environment are fundamentally different challenges.
Barbaccia highlighted the CVE catalog—the government’s running list of known software flaws—as an area where Mythos’s speed could offer practical value. A human analyst reviewing the catalog would require significant time, whereas a model like Mythos could process it far faster. However, speed alone does not determine whether a vulnerability poses an actual threat.
“There’s a difference between something that is exploitable in a 4-nanosecond window during a BIOS boot versus what’s the reality of that being exploited in the real world. We have to understand, just like you could secure your entire threat surface, where are the crown jewels? And how do you protect something, and make sure the protection you’re deploying is worthwhile what you’re protecting.”
This approach is familiar to federal network defenders, who operate under resource constraints and must prioritize which vulnerabilities to address first. What Mythos potentially changes is the speed at which that triage occurs.
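To make the triage idea concrete, here is a minimal sketch of the kind of prioritization Barbaccia describes, weighing how plausible real-world exploitation is against how valuable the affected system is. Everything in it is an assumption for illustration: the `Finding` fields, the multiplicative score, the sample values, and the placeholder identifiers are hypothetical and do not represent any agency's or Anthropic's actual method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One vulnerability finding. All fields are hypothetical."""
    finding_id: str            # placeholder label, not a real CVE identifier
    exploit_likelihood: float  # assumed 0.0-1.0 estimate of real-world exploitability
    asset_criticality: int     # assumed scale: 1 (low value) to 5 ("crown jewels")


def priority(finding: Finding) -> float:
    """Assumed scoring rule: likelihood of exploitation times asset value."""
    return finding.exploit_likelihood * finding.asset_criticality


def triage(findings: list[Finding]) -> list[Finding]:
    """Return findings ordered from highest to lowest priority."""
    return sorted(findings, key=priority, reverse=True)


findings = [
    # Critical system, but only exploitable in a very narrow window.
    Finding("EXAMPLE-0001", exploit_likelihood=0.01, asset_criticality=5),
    # Fairly likely to be exploited, on an important system.
    Finding("EXAMPLE-0002", exploit_likelihood=0.60, asset_criticality=4),
    # Very likely to be exploited, but on a low-value system.
    Finding("EXAMPLE-0003", exploit_likelihood=0.90, asset_criticality=1),
]

ranked = triage(findings)
for f in ranked:
    print(f.finding_id, round(priority(f), 2))
```

Under these assumed numbers, the flaw with the narrow exploitation window ranks last even though it sits on the most critical system, which mirrors the point that a technically exploitable bug is not automatically the most urgent one. A faster model would change how quickly the inputs to such a ranking are produced, not the need to rank.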