Keywords: AI for research, AI for Biology, Agents, safety, openclaw, skills
TL;DR: We present BioSkillSafety, the first systematic framework for evaluating skill-based agent safety in bioinformatics domains.
Abstract: LLM agents have rapidly emerged as transformative tools for biomedical research, yet their safety risks in bioinformatics-specific contexts remain unexplored. We present **BioSkillSafety, the first systematic framework for evaluating skill-based agent safety in bioinformatics domains.** Our six-layer taxonomy achieves 100% coverage across 13 attack cases spanning genomics, transcriptomics, clinical, infrastructure, and external communication domains. Through 429 trials across 11 models and 3 real-world skill repositories, we reveal that all skill libraries exhibit consistent vulnerabilities, model safety varies significantly with backbone selection, and domain-specific patterns demand targeted safeguards. These findings establish standardized benchmarks for trustworthy deployment of biomedical AI agents, contributing to safer and more reliable AI-assisted biomedical research.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 22