TAP attack success rate (calendar agent): before vs. after spotlighting + classifier
94.6% → 6.2%
Average attack success rate (ASR) reduction from adversarial training alone (Gemini 2.5 vs 2.0)
47%
OWASP LLM Top 10 rank for prompt injection (2024 and 2025, consecutively)
#1


