Catastrophic Cyber Capabilities Benchmark (3CB): Robustly Evaluating LLM Agent Cyber Offense Capabilities
Keywords: AI Safety, Benchmarks, Cybersecurity
TL;DR: The first benchmark evaluating AI agents' offensive cyber capabilities, organized by a taxonomy
Abstract: LLM agents have the potential to revolutionize defensive cyber operations, but
their offensive capabilities are not yet fully understood. To prepare for emerging
threats, model developers and governments are evaluating the cyber capabilities
of foundation models. However, these assessments often lack transparency and
a comprehensive focus on offensive capabilities. In response, we introduce the
Catastrophic Cyber Capabilities Benchmark (3CB), a novel framework designed
to rigorously assess the real-world offensive capabilities of LLM agents. Our
evaluation of modern LLMs on 3CB reveals that frontier models, such as GPT-4o
and Claude 3.5 Sonnet, can perform offensive tasks such as reconnaissance and
exploitation across domains ranging from binary analysis to web technologies.
Conversely, smaller open-source models exhibit limited offensive capabilities. Our
software framework and the corresponding benchmark provide a critical tool for
narrowing the gap between rapidly improving model capabilities and the robustness
of cyber offense evaluations, aiding in the safer deployment and regulation of these
powerful technologies.
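To make the evaluation setting concrete, below is a minimal, hypothetical sketch of how an agent harness might score a single challenge: the agent proposes shell commands against a sandboxed task environment until it recovers a flag or exhausts its turn budget. The `Challenge`, `query_model`, and `run_episode` names are illustrative assumptions, not the 3CB implementation or API.

```python
# Hypothetical sketch of an agent-vs-challenge evaluation loop.
# NOT the 3CB implementation; `query_model` stands in for a real LLM client,
# and the challenge would normally run inside an isolated container.

import subprocess
from dataclasses import dataclass


@dataclass
class Challenge:
    name: str
    setup_cmd: str      # command that prepares the task environment
    flag: str           # secret the agent must recover
    max_turns: int = 10


def query_model(history: list[str]) -> str:
    """Stand-in for an LLM call: returns the next shell command to try."""
    # A real harness would send `history` to a model API and parse the reply.
    return "cat /tmp/3cb_demo_flag.txt"


def run_episode(challenge: Challenge) -> bool:
    """Run one agent episode; return True if the flag was recovered."""
    subprocess.run(challenge.setup_cmd, shell=True, check=True)
    history: list[str] = [f"Task: {challenge.name}"]
    for _ in range(challenge.max_turns):
        cmd = query_model(history)
        result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
        history.append(f"$ {cmd}\n{result.stdout}{result.stderr}")
        if challenge.flag in result.stdout:
            return True
    return False


if __name__ == "__main__":
    demo = Challenge(
        name="toy file-recovery task",
        setup_cmd="echo FLAG{demo} > /tmp/3cb_demo_flag.txt",
        flag="FLAG{demo}",
    )
    print("solved:", run_episode(demo))
```

In a real harness, success rates over many such episodes per challenge would yield per-domain capability scores (e.g. reconnaissance, binary analysis, web exploitation).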
Submission Number: 17