Lasso Security compromised Perplexity’s BrowseSafe guardrail model for AI browsers, proving that "out-of-the-box" tools fail to stop prompt injection attacks.
The 36% bypass rate is concerning but not surprising. Prompt injection defense needs multiple layers — decoding, red-teaming, behavioral constraints, and monitoring — not just another LLM acting as a filter.
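To make the layering idea concrete, here is a minimal Python sketch of three of those layers: decoding input before scanning it, constraining what the agent may do, and logging hits for monitoring. It is not how BrowseSafe or Lasso's tooling works. The phrase pattern, the `ALLOWED_ACTIONS` set, and all function names are hypothetical, chosen only for illustration. Red-teaming is a process rather than a function, so it is not shown.

```python
import base64
import binascii
import logging
import re
import unicodedata
from urllib.parse import unquote

logger = logging.getLogger("injection_monitor")

# Hypothetical phrase pattern; a real deployment would need far broader coverage.
SUSPICIOUS = re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE)

# Hypothetical allowlist of actions the agent may take on untrusted pages.
ALLOWED_ACTIONS = {"read", "summarize"}


def decode_layers(text: str) -> list[str]:
    """Return the raw text plus Unicode-normalized, URL-decoded, and Base64-decoded views."""
    views = [text, unicodedata.normalize("NFKC", text), unquote(text)]
    # Try to decode anything that looks like a Base64 token of 16+ characters.
    for token in re.findall(r"[A-Za-z0-9+/]{16,}={0,2}", text):
        try:
            views.append(base64.b64decode(token, validate=True).decode("utf-8"))
        except (binascii.Error, UnicodeDecodeError):
            continue  # not valid Base64 text, skip it
    return views


def looks_injected(text: str) -> bool:
    """Scan every decoded view of the text and log a warning on a match (monitoring layer)."""
    hit = any(SUSPICIOUS.search(view) for view in decode_layers(text))
    if hit:
        logger.warning("possible prompt injection: %r", text[:80])
    return hit


def action_permitted(action: str) -> bool:
    """Behavioral constraint: refuse any action that is not explicitly allowlisted."""
    return action in ALLOWED_ACTIONS
```

The point of the allowlist is that it still holds when the scanner misses something: an injected instruction to send email goes nowhere if `send_email` was never a permitted action.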
It's a harsh reality, but we are never completely secure online. We have to stay vigilant, especially now, because bad actors will always be looking for exploits (and finding them when security gets lax). I know it's exhausting. Prompt injections seem to be the new Trojan horses.
DAN techniques keep evolving. They have been really “fun” and creative. I’m glad to see people talking about them as a real threat to guard against.