Mining Valuable Sub-Expressions for Symbolic Regression

17 Sept 2025 (modified: 11 Feb 2026) | Submitted to ICLR 2026 | Readers: Everyone | CC BY 4.0
Keywords: Symbolic regression, Reinforcement learning
Abstract: Symbolic Regression (SR) aims to discover mathematical expressions from data, but classical methods are hampered by an immense search space. This inefficiency stems from their tendency to construct expressions atom by atom from basic operators and variables, overlooking the benefit of reusing meaningful sub-expressions. To address this challenge, we introduce Mining Sub-Expression Symbolic Regression (MSSR), a framework that discovers and reuses valuable sub-expressions to search efficiently for the correct symbolic form. MSSR employs cooperative multi-agent reinforcement learning, augmented with genetic programming, to sample sub-expressions from a dynamically evolving library and combine them into a mathematical expression. A pruning mechanism based on the coefficient of variation removes redundant terms, promoting the discovery of parsimonious expressions. We conduct extensive experiments on SRBench and fluid dynamics benchmarks. The results show that, compared with 24 baseline methods, MSSR recovers more ground-truth expressions and achieves a better balance between predictive accuracy and model simplicity.
Supplementary Material: zip
Primary Area: neurosymbolic & hybrid AI systems (physics-informed, logic & formal reasoning, etc.)
Submission Number: 8438
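
Illustrative sketch (not the authors' implementation): the abstract does not say how the coefficient of variation (CV) enters the pruning step. One possible reading, assumed here purely for illustration, is that the candidate expression is a linear combination of sub-expressions, that the coefficients are refit on bootstrap resamples, and that a term is dropped when its coefficient's CV (standard deviation divided by absolute mean) exceeds a threshold. The function name `cv_prune`, the bootstrap scheme, and the threshold value are all hypothetical.

```python
import numpy as np

def cv_prune(features, y, n_resamples=30, cv_threshold=0.5, seed=0):
    """Hypothetical CV-based pruning of terms in a linear combination.

    features: (n_samples, n_terms) array, one column per sub-expression
              evaluated on the data.
    y:        (n_samples,) target values.
    Returns (keep_mask, cv), where keep_mask[i] is True when the
    coefficient of term i is stable across bootstrap refits.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_terms = features.shape
    coefs = np.empty((n_resamples, n_terms))
    for r in range(n_resamples):
        # Resample rows with replacement and refit by least squares.
        idx = rng.integers(0, n_samples, size=n_samples)
        coefs[r] = np.linalg.lstsq(features[idx], y[idx], rcond=None)[0]
    mean_abs = np.abs(coefs.mean(axis=0))
    std = coefs.std(axis=0)
    # Coefficient of variation; the floor avoids division by zero.
    cv = std / np.maximum(mean_abs, 1e-12)
    return cv < cv_threshold, cv

# Toy data (assumed): y depends on x and x**2; sin(x) is a redundant term.
rng = np.random.default_rng(1)
x = np.linspace(-3.0, 3.0, 200)
y = 2.0 * x + 0.5 * x**2 + rng.normal(0.0, 0.01, size=x.shape)
features = np.column_stack([x, x**2, np.sin(x)])
keep, cv = cv_prune(features, y)
```

In this toy setting the coefficients of `x` and `x**2` vary little across refits, so their CV is small and they are kept, while the redundant `sin(x)` term has a coefficient near zero and a comparatively large CV.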