DecAEvolve: Decompose, Adapt, and Evolve, or Three Pillars of Effective LLM-based Scientific Equation Discovery
Keywords: Symbolic Regression, Equation Discovery, Large Language Models, Evolutionary Search
TL;DR: We propose DecAEvolve (Decompose, Adapt, and Evolve), an approach that leverages structural decomposition and adaptation through RL fine-tuning to enhance the efficiency of LLM-based evolutionary scientific discovery frameworks.
Abstract: Finding mathematical relations underlying natural phenomena and scientific systems has been one of the fundamental tasks in the history of scientific discovery. Recent advances in evolutionary search with Large Language Models (LLMs), with their embedded scientific knowledge, have shown great promise for this task. However, discovering the mathematical models governing scientific observations remains significantly challenging, as it requires navigating vast combinatorial hypothesis spaces with an explosion of possible relations. Existing LLM-based approaches overlook the impact of data on the structure of mathematical relations and treat LLMs as static hypothesis generators unaware of the observed scientific system. This leads to suboptimal and inefficient exploration of the hypothesis space, with over-reliance on LLMs' internal priors. To bridge this gap, we introduce Decompose, Adapt, and Evolve (DecAEvolve), a framework that leverages granular feedback from symbolic term decomposition and LLM refinement through reinforcement learning (RL) fine-tuning to enhance both the robustness and efficiency of evolutionary discovery frameworks. Our experiments across diverse datasets demonstrate that DecAEvolve significantly improves the accuracy of discovered equations and the efficiency of the discovery process compared to state-of-the-art baselines.
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Submission Number: 22651