TLDR: Lakera, in collaboration with Check Point Software Technologies and the UK AI Security Institute, has launched the ‘backbone breaker benchmark’ (b3). This open-source security evaluation tool is designed to assess the vulnerabilities of Large Language Model (LLM) backends within AI agents. The b3 benchmark utilizes ‘threat snapshots’ and a dataset of nearly 20,000 crowdsourced adversarial attacks to identify weaknesses like prompt exfiltration and malicious code injection, aiming to enhance the security posture of AI applications.
Lakera, a leading AI-native security platform and a Check Point company, has announced the release of the ‘backbone breaker benchmark’ (b3), an open-source security evaluation tool developed in partnership with Check Point Software Technologies and researchers from The UK AI Security Institute (AISI). This innovative benchmark is specifically engineered to address the critical security challenges posed by Large Language Model (LLM) backends operating within AI agents.
The b3 benchmark introduces a novel concept called ‘threat snapshots.’ Unlike traditional methods that simulate an entire AI agent workflow, threat snapshots focus on pinpointing crucial moments where LLM vulnerabilities are most likely to manifest. This targeted approach allows developers and model providers to efficiently evaluate their systems’ resilience against realistic adversarial attacks without the extensive overhead of full agent workflow modeling.
Mateo Rojas-Carulla, Co-Founder and Chief Scientist at Lakera, emphasized the necessity of this benchmark, stating, “We built the b3 benchmark because today’s AI agents are only as secure as the LLMs that power them. Threat Snapshots allow us to systematically surface vulnerabilities that have until now remained hidden in complex agent workflows. By making this benchmark open to the world, we hope to equip developers and model providers with a realistic way to measure, and improve, their security posture.”
The benchmark integrates 10 distinct agent ‘threat snapshots’ with a comprehensive dataset comprising 19,433 crowdsourced adversarial attacks. These attacks were gathered through ‘Gandalf: Agent Breaker,’ a gamified red-teaming exercise. The b3 benchmark assesses susceptibility to various attack vectors, including system prompt exfiltration, insertion of phishing links, malicious code injection, denial-of-service, and unauthorized tool calls.
Also Read:
- OWASP Unveils Groundbreaking AI Vulnerability Scoring System at Global AppSec
- Cybersecurity Alert: Malicious Actors Exploit AI Agent Identities to Evade Bot Defenses
Initial testing of 31 popular LLMs using the b3 benchmark has yielded significant insights. The findings indicate that enhanced reasoning capabilities in LLMs correlate with improved security performance. Notably, the size of an LLM does not appear to be a determining factor in its security performance. This benchmark is poised to become a vital tool for developers and organizations striving to build more secure and robust AI agent applications.


