SkillPuzzler: A Self-Evolving Agentic Framework for Materials and Chemistry Research with Minimal Reliance on Predefined Tools

Published: 24 Sept 2025, Last Modified: 15 Oct 2025
Venue: NeurIPS2025-AI4Science Poster
License: CC BY 4.0
Track: Track 1: Original Research/Position/Education/Attention Track
Keywords: self-evolving agents, large language models, skill acquisition, materials science, chemistry
Abstract: The prevailing “large language model (LLM) + tool-use” paradigm relies on hand-crafted tool interfaces, which constrain an agent's ability to solve complex problems and hinder the adoption of agents as scientific copilots across the broader research community. We advocate a dynamic, scalable “LLM + skill-acquisition” paradigm and present SkillPuzzler as one concrete instantiation of it. SkillPuzzler combines only 4 specialized agents with 15 general-purpose tools, yet exhibits self-evolution while tackling diverse research tasks in materials science and chemistry. Its behavior is driven by prompt-encoded mindsets tailored to our customized Model Context Protocol (MCP) servers. SkillPuzzler autonomously acquires new skills by learning and adapting knowledge into self-defined tools for problem-solving. It achieves 96.7% accuracy with the OpenAI o3 model on our 74-task benchmark and outperforms two baselines by a wide margin, demonstrating the robustness and effectiveness of its self-evolution mechanism.
Submission Number: 443