Keywords: AI evaluation, sociotechnical safety, risk assessment, capability evaluations, ecological validity, AI governance, safety institutes, evaluation methodology, cyber risk, biological risk, agentic AI
TL;DR: We introduce risk chain analysis to show how different evaluation actors and communities can coordinate to cover the full pathway from AI capability to real-world harm, arguing that AI safety institutes are best positioned to coordinate this effort.
Abstract: Capability evaluations cover only a narrow portion of the pathway from AI capability to real-world harm. We introduce risk chain analysis — tracing harm trajectories step by step and mapping evaluation evidence at each stage — and apply it to three risk domains: cyber attacks, biological risk, and agentic AI failures. Evaluation consistently concentrates where benchmarking is easiest, not where risk is highest. The uncovered steps require different methods and different actors: sectoral regulators, deploying organisations, and domain-specific agencies. We argue that AI safety institutes are uniquely positioned to coordinate this distributed evaluation ecosystem.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Type: Research Paper
Archival Status: Archival
Submission Number: 90