SafeDiscovery–Plans: An Open, Safety‑Constrained Scientific Planning Dataset for Agentic AI Across High‑Risk Domains
Track: Track 2: Dataset Proposal Competition
Keywords: Scientific Planning; AI for Science; Agent; Alignment; Training Dataset
Abstract: Scientific discovery routinely involves executing complex sequences of
laboratory steps while navigating institutional policies, biosafety
levels and regulatory constraints. Current language models excel at
general planning but falter when tasks demand both scientific
competence and rigorous adherence to safety rules. We introduce
SafeDiscovery–Plans, an open dataset of safety‑constrained
scientific plans designed to teach agentic AI how to transform
high‑level research goals into safe, compliant procedures. Each
example pairs a goal and laboratory setting with a validated,
stepwise plan that either accomplishes the objective or proposes a
safe redirection when it cannot be achieved under the given
constraints. Plans include personal protective equipment (PPE),
engineering controls, safe substitutions, decision points and
citations to authoritative sources. The first version will contain roughly
30,000 records spanning chemistry, biology and other high‑risk
domains, with a roadmap to larger scale. By supplying
structured supervision for policy‑grounded planning, SafeDiscovery–Plans
fills a critical gap between capability‑centric benchmarks and
refusal‑centric safety datasets.
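
To make the record structure described in the abstract concrete, the sketch below shows one way a single example might be represented and checked. It is illustrative only: the abstract does not specify a schema, so every name here (`PlanRecord`, `PlanStep`, `outcome`, `validate`) is a hypothetical assumption, and the sample content and citation strings are placeholders rather than real dataset entries.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanStep:
    """One step of a plan (hypothetical field names)."""
    action: str
    ppe: List[str] = field(default_factory=list)
    engineering_controls: List[str] = field(default_factory=list)
    citations: List[str] = field(default_factory=list)


@dataclass
class PlanRecord:
    """A goal and laboratory setting paired with a stepwise plan."""
    goal: str
    lab_setting: str
    outcome: str  # assumed values: "accomplish" or "redirect"
    steps: List[PlanStep] = field(default_factory=list)
    safe_substitutions: List[str] = field(default_factory=list)
    decision_points: List[str] = field(default_factory=list)


def validate(record: PlanRecord) -> List[str]:
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    if record.outcome not in ("accomplish", "redirect"):
        problems.append("outcome must be 'accomplish' or 'redirect'")
    if not record.steps:
        problems.append("plan has no steps")
    for i, step in enumerate(record.steps, start=1):
        if not step.citations:
            problems.append(f"step {i} has no citation")
    return problems


# Placeholder record: the text and citation labels are invented for illustration.
example = PlanRecord(
    goal="Prepare a buffer solution for a teaching lab",
    lab_setting="Undergraduate teaching laboratory",
    outcome="accomplish",
    steps=[
        PlanStep(
            action="Review the safety data sheet for each reagent",
            ppe=["safety glasses", "lab coat", "gloves"],
            engineering_controls=["general ventilation"],
            citations=["placeholder: institutional safety manual"],
        )
    ],
    decision_points=["Stop and consult the supervisor if a reagent is unlabeled"],
)

print(validate(example))
```

Under these assumptions, a redirection example would use `outcome="redirect"` and steps that describe the safer alternative instead of the original goal.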
Submission Number: 332