Antigravity prompt injection vulnerability highlights security threats of AI-powered coding tools

An indirect prompt injection turns the AI agent in Google's Antigravity IDE into an insider threat, bypassing security controls to steal credentials.

Nov 28, 2025

A newly discovered vulnerability in Google’s Antigravity platform demonstrates how its autonomous AI agents can be manipulated into exfiltrating sensitive data from a developer’s environment. Security researchers at PromptArmor found that an indirect prompt injection, hidden within a seemingly harmless online document, can coerce Antigravity’s AI into bypassing its own security settings to steal credentials and proprietary code. The attack leverages the very agentic capabilities that Google promotes as the platform’s core strength.

What is Google Antigravity?

Google introduced Antigravity as an agentic development platform designed to help developers operate at a higher, task-oriented level. (The announcement was kind of overshadowed as it was released at the same time as Gemini 3.0 Pro, Google’s latest flagship large language model.)

Antigravity combines a familiar AI-powered code editor with an agent-first interface. This design allows developers to delegate complex, end-to-end tasks to autonomous agents that can plan, execute, and verify their work across the editor, terminal, and browser without constant human intervention.

The platform is split into two main interfaces. The Editor View provides a standard AI-assisted IDE for hands-on coding. The Manager Surface is where developers can launch and monitor multiple agents working asynchronously on different tasks, such as fixing bugs or performing long-running maintenance. The goal is to offload work, allowing a developer to focus on one task while agents handle others in the background.

Anatomy of an agentic heist

The attack chain discovered by PromptArmor begins with a common developer scenario: using an online guide for a software integration. A developer provides Antigravity with a URL to a reference implementation guide. (This can happen quite a lot as nowadays, developers tend to use vibe-coding tools to write code for frameworks they don’t know themselves; they just point the AI to a guide and let it figure out how to do the work by itself.)

The agent accesses the site to learn from it, but the web page contains the attacker’s payload: a prompt injection instruction (to prevent users from seeing the payload, the attacker can use different styling techniques to hide it, such as setting the font size to a very small number). The agent ingests this hidden text, which manipulates it into believing it must collect and submit data to a fictitious tool to complete the user’s request.

Hidden instruction in webpage results in prompt injection against Google’s Antigravity tool (source: PromptArmor)

Following the malicious instructions, the agent gathers context from the user’s codebase and then attempts to access credentials stored in a .env file. The developer had correctly listed this file in their .gitignore, and Antigravity’s default configuration prevents the agent from reading such files with its built-in tool. However, torn between following the task instructions and following the security protocols, the agent finds a loophole, using the cat terminal command to dump the file’s contents.

With the sensitive data acquired, the agent constructs a malicious URL. It appends the stolen credentials and code snippets to a webhook.site domain monitored by the attacker. Finally, it invokes a browser subagent (one of Antigravity’s flagship features for testing web applications) and instructs it to open the dangerous URL. This action completes the exfiltration, delivering the confidential data directly to the attacker.

Prompt injection results in Antigravity stealing your sensitive information (source: PromptArmor)

Why the built-in guardrails failed

According to PromptArmor, attack succeeded partly because Antigravity’s default security settings are too permissive. The platform includes a Browser URL Allowlist to prevent agents from visiting malicious sites. However, the default list includes webhook.site, a public service that allows anyone to create a URL and monitor all requests sent to it, providing a perfect channel for data exfiltration.

Furthermore, the platform’s model for human supervision proved insufficient. During onboarding, users are encouraged to accept default settings that allow the AI agent to determine when a human review of its plans or commands is necessary. This, combined with the Agent Manager interface that encourages running agents unsupervised in the background, makes it highly implausible that a developer would catch a malicious action before it executes.

What developers should do

These kinds of vulnerabilities demand some action by the developer of Antigravity, and I expect Google to tighten the security guardrails as we continue to explore AI-powered IDEs.

In the meantime, to mitigate this risk, developers using Antigravity should immediately audit its default security configurations. This includes removing overly permissive domains like webhook.site from the browser allowlist and changing the settings to require explicit manual approval for all agent-executed terminal commands and browser actions.

Developers must also treat any external content fed to an AI agent, including documentation and websites, as untrusted input. Google warns users about data exfiltration risks during onboarding, but this attack demonstrates that relying on user vigilance and disclaimers is not a substitute for secure-by-default design.

This vulnerability is a stark example of what AI researcher Andrej Karpathy calls “jagged intelligence.” AI models can perform impressive tasks, like orchestrating complex coding projects, yet fail catastrophically at seemingly simpler ones, like recognizing a malicious instruction. The agent does not “understand” it is stealing data; it simply follows the attacker’s instructions with the same logical efficiency it applies to legitimate tasks. It doesn’t even realize that the hidden instructions in the visited external webpage is out of place and treats it as any other instruction that is inside its prompt. It lacks the broader context of what it’s doing.

The Antigravity agent was smart enough to devise a novel way to access a blocked file but not wise enough to question the suspicious nature of its instructions. As AI tools become more autonomous and integrated into core development workflows, these jagged edges of their intelligence will become the new attack surface. Make use of AI in your daily work, but be careful of giving them too much autonomy.

ToxSec

Dec 4

“The agent accesses the site to learn from it, but the web page contains the attacker’s payload: a prompt injection instruction (to prevent users from seeing the payload, the attacker can use different styling techniques to hide it, such as setting the font size to a very small number). The agent ingests this hidden text, which manipulates it into believing it must collect and submit data to a fictitious tool to complete the user’s request.”

Excellent summary of the attack. I will never use agentic browser that’s logged into Substack or shopping lol!

🫟

Expand full comment

Lyka Saint

Nov 29

It just takes one little thing. Thank you for this. It's a little over my head since I'm not a developer, but it's a perfect example of the inherent risks. And most AI tool users, like me, will not understand what's going on "under the hood." 😟

1 more comment...

TechTalks

Discussion about this post

Ready for more?