Abstract: Symbolic equations are crucial for scientific discovery. Symbolic regression, the task of extracting underlying mathematical expressions from data, remains a challenging problem in artificial intelligence. Although recent algorithms that integrate symbolic regression with neural networks have emerged in the machine learning community, these approaches focus primarily on small-scale dependencies among symbols, neglecting relationships at larger scales (such as among substructures) and interactions between small and large scales (such as between symbols and substructures). Such single-scale generation models can produce redundant expression structures and suffer from convergence oscillations. This paper introduces Neuro-Encoded Expression Programming with Automatically Defined Functions (NEEP-ADF), a novel method that addresses these challenges by learning multi-scale relationships. NEEP-ADF is based on two core ideas: (1) symbols form reusable substructure modules through small-scale dependencies; (2) the model captures large-scale relationships among substructures to adapt to specific target problems. This multi-scale approach gives NEEP-ADF flexible scalability, enabling it to dynamically adjust the complexity of solutions through symbols and substructures, thereby effectively handling problems of unknown scale. In a series of synthetic and real-world benchmarks, both versions of NEEP-ADF (Evolutionary Computation and Reinforcement Learning) achieved state-of-the-art performance and convergence speed among the compared algorithms.
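To make the two core ideas concrete, the following minimal sketch (purely illustrative, not taken from the paper) shows how a reusable substructure, in the spirit of an automatically defined function (ADF), can be discovered once at the small scale and reused at the large scale, avoiding redundant copies of the same symbol pattern in the final expression. All names here (`adf`, `main_expression`) are hypothetical.

```python
import math

def adf(x):
    # Small-scale dependency among symbols: a pattern such as sin(x) * x,
    # captured once as a reusable substructure module (hypothetical ADF).
    return math.sin(x) * x

def main_expression(x, y):
    # Large-scale relationship: the main expression composes the same
    # substructure twice instead of rediscovering the pattern each time.
    return adf(x) + adf(y) ** 2

# Evaluating the composed expression on sample inputs.
print(main_expression(1.0, 2.0))
```

In a single-scale representation, the pattern `sin(x) * x` would have to be regenerated symbol by symbol at every occurrence; factoring it into a named substructure is what lets the complexity of candidate solutions grow or shrink flexibly.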