“The agent accesses the site to learn from it, but the web page contains the attacker’s payload: a prompt injection instruction (to prevent users from seeing the payload, the attacker can use different styling techniques to hide it, such as setting the font size to a very small number). The agent ingests this hidden text, which manipulates it into believing it must collect and submit data to a fictitious tool to complete the user’s request.”
Excellent summary of the attack. I will never use agentic browser that’s logged into Substack or shopping lol!
It just takes one little thing. Thank you for this. It's a little over my head since I'm not a developer, but it's a perfect example of the inherent risks. And most AI tool users, like me, will not understand what's going on "under the hood." 😟
The cat command workaround to bypass the .gitignore protection is pretty revealing about how these agents approach constraints. Karpathy's jagged intelligence concept really captures this, when the agent is clever enought to find loopholes but lacks any context about why it shoudnt. The webhook.site being on the default allowlist seems like an obvios oversight though.
“The agent accesses the site to learn from it, but the web page contains the attacker’s payload: a prompt injection instruction (to prevent users from seeing the payload, the attacker can use different styling techniques to hide it, such as setting the font size to a very small number). The agent ingests this hidden text, which manipulates it into believing it must collect and submit data to a fictitious tool to complete the user’s request.”
Excellent summary of the attack. I will never use agentic browser that’s logged into Substack or shopping lol!
It just takes one little thing. Thank you for this. It's a little over my head since I'm not a developer, but it's a perfect example of the inherent risks. And most AI tool users, like me, will not understand what's going on "under the hood." 😟
The cat command workaround to bypass the .gitignore protection is pretty revealing about how these agents approach constraints. Karpathy's jagged intelligence concept really captures this, when the agent is clever enought to find loopholes but lacks any context about why it shoudnt. The webhook.site being on the default allowlist seems like an obvios oversight though.