As organizations increasingly adopt agentic AI for business and IT operations, researchers continue to uncover vulnerabilities in major commercial models that expand potential attack surfaces. This week, Pillar Security disclosed a critical flaw in Antigravity, Google’s AI-powered developer tool for filesystem operations.
The vulnerability, now patched by Google, combined prompt injection with Antigravity’s file-creation capabilities to grant attackers remote code execution (RCE) privileges. Researchers demonstrated how the exploit bypassed Secure Mode, Google’s highest security setting for AI agents, which is designed to restrict access to sensitive systems and prevent malicious shell commands.
How the Exploit Circumvented Secure Mode
Secure Mode is intended to limit AI agent capabilities by routing all command operations through a virtual sandbox environment, throttling network access, and preventing code execution outside the working directory. However, the exploit leveraged a file-searching tool called “find_by_name”, classified as a ‘native’ system tool.
Because native tools execute directly before Secure Mode can evaluate commands, the security boundary was bypassed entirely.
"The security boundary that Secure Mode enforces simply never sees this call," wrote Dan Lisichkin, an AI security researcher at Pillar Security. "This means an attacker achieves arbitrary code execution under the exact configuration a security-conscious user would rely on to prevent it."
Delivery Methods and Attack Vectors
Prompt injection attacks can be delivered through:
- Compromised identity accounts linked to the AI agent
- Malicious instructions hidden in open-source files or web content
- Unvalidated input from documents or files the agent ingests
Antigravity struggles to distinguish between contextual data and literal prompt instructions, allowing compromise without elevated access. For example, an attacker could embed malicious prompts in a seemingly harmless file, which the agent would then execute.
Timeline and Industry Impact
According to Pillar Security’s disclosure timeline, the vulnerability was reported to Google on January 6, 2025, and patched by February 28, 2025. Google awarded a bug bounty for the discovery.
Lisichkin noted that similar prompt injection vulnerabilities have been found in other coding AI agents, such as Cursor. He emphasized that in the era of autonomous AI, unvalidated input can become a malicious prompt capable of hijacking internal systems.
"The trust model underpinning security assumptions—that a human will catch something suspicious—does not hold when autonomous agents follow instructions from external content," Lisichkin wrote.
The exploit’s ability to bypass Google’s Secure Mode highlights a critical gap in current cybersecurity practices. Lisichkin stressed that the industry must move beyond sanitization-based controls and adopt stricter auditing measures.
"Every native tool parameter that reaches a shell command is a potential injection point. Auditing for this class of vulnerability is no longer optional—it is a prerequisite for shipping agentic features safely," he wrote.