Keywords: AI scientist; agents; LLM agents; LLM; safety
Abstract: The emergence of autonomous AI systems capable of conducting scientific research has introduced unprecedented opportunities and risks in the scientific enterprise. This survey examines the current state of AI scientist safety, analyzing major safety concerns, documented incidents, and emerging challenges in automated scientific discovery systems. Through a comprehensive literature review and case study analysis, we identify four critical risk categories: technical failures and hallucinations, dual-use and misuse potential, research integrity violations, and autonomous system alignment problems. Our analysis reveals that while current AI systems such as GPT-4, Claude, and specialized research agents demonstrate remarkable capabilities, they exhibit concerning failure modes, including systematic hallucination rates (1.7-33%), research fabrication, and dangerous autonomous behaviors. We propose a framework for evaluating AI scientist safety and provide recommendations for safer deployment of automated research systems.
Submission Number: 133