Keywords: biosafety, generative models, protein design, biosecurity, benchmarks, RFdiffusion
TL;DR: A benchmark framework to stress-test generative protein binder models along biosafety dimensions of refusal, plausibility, safety distance, and robustness.
Abstract: Generative AI has transformed protein binder design, enabling rapid creation of compact proteins with high predicted foldability and affinity. Yet these advances raise biosafety concerns: current models lack refusal mechanisms, treat benign and hazardous specifications equivalently, and are easily exploitable by adversarial prompting. We introduce a governance-aware benchmark for stress-testing generative protein design models. Input specifications are stratified into three layers inspired by biosafety levels—Benign, Ambiguous, and Malicious—and evaluated along four orthogonal dimensions: Refusal, Plausibility, Safety Distance, and Adversarial Robustness. Results are reported in a Spec × Metric matrix that highlights cross-layer safety gaps without disclosing sensitive sequences. A pilot evaluation with RFdiffusion shows no refusal, plausibility scores insensitive to biosafety level, trivial robustness, and stratification only in safety distance. These findings underscore the absence of intrinsic biosafety alignment in current structural generators. Grounded in established biosafety frameworks, this benchmark provides a reproducible foundation for community standards at the intersection of generative biology, AI safety, and governance.
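The abstract describes results aggregated into a Spec × Metric matrix over three specification layers and four evaluation dimensions. A minimal sketch of such a reporting structure is shown below; all class, function, and metric names are illustrative placeholders (not the authors' released code), and the scores are dummy values, so no sequences or real model outputs are involved.

```python
"""Illustrative sketch of a Spec x Metric reporting matrix.

Hypothetical names and dummy scores only; this is not the benchmark's
actual implementation.
"""

from dataclasses import dataclass
from enum import Enum
from statistics import mean


class SpecLayer(Enum):
    BENIGN = "Benign"
    AMBIGUOUS = "Ambiguous"
    MALICIOUS = "Malicious"


# The four evaluation dimensions named in the abstract.
METRICS = ("Refusal", "Plausibility", "SafetyDistance", "AdversarialRobustness")


@dataclass
class EvalRecord:
    """One evaluated input specification: its layer plus one score per metric,
    each assumed to be normalized to [0, 1]."""
    layer: SpecLayer
    scores: dict  # metric name -> float


def spec_metric_matrix(records):
    """Aggregate per-specification scores into a layer x metric matrix of means."""
    matrix = {}
    for layer in SpecLayer:
        layer_records = [r for r in records if r.layer is layer]
        matrix[layer.value] = {
            m: mean(r.scores[m] for r in layer_records) if layer_records else float("nan")
            for m in METRICS
        }
    return matrix


if __name__ == "__main__":
    # Dummy records for illustration only.
    demo = [
        EvalRecord(SpecLayer.BENIGN, dict(Refusal=0.0, Plausibility=0.8,
                                          SafetyDistance=0.9, AdversarialRobustness=1.0)),
        EvalRecord(SpecLayer.MALICIOUS, dict(Refusal=0.0, Plausibility=0.8,
                                             SafetyDistance=0.3, AdversarialRobustness=0.0)),
    ]
    for layer, row in spec_metric_matrix(demo).items():
        print(layer, {m: round(v, 2) for m, v in row.items()})
```

Reporting only layer-level aggregates, rather than per-specification outputs, is consistent with the abstract's goal of surfacing cross-layer safety gaps without disclosing sensitive sequences.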
Submission Number: 35