CaMo: Capturing the modularity by end-to-end models for Symbolic Regression

Published: 01 Jan 2025, Last Modified: 26 Jan 2025, Knowl. Based Syst. 2025, CC BY-SA 4.0
Abstract: Modularity is a ubiquitous principle that appears throughout nature, society, and human endeavors, from biological systems to organizational structures. In Symbolic Regression, which aims to find explicit expressions from observed data, modularity can be viewed as a form of knowledge that captures salient substructures in order to achieve better fitting results. Symbolic Regression is essentially a composition optimization problem, so retaining valuable substructures can make the subsequent search more efficient. In this paper, we propose to acquire modularity during the search process, and we use the term module to denote a useful substructure. Specifically, an end-to-end model is chosen to incorporate modules into the search procedure because of its scalability and generalization ability. Modules are treated as high-order knowledge and act as fundamental operators, expanding the search library of Symbolic Regression. The proposed algorithm enables self-learning, or self-evolution, of modules as part of the learning component. In addition, a module extraction strategy generates modules hierarchically from the expression tree, and a module update mechanism eliminates unnecessary modules while incorporating new, useful ones. Experiments were conducted to evaluate the effectiveness of each component.
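To make the idea of extracting modules from an expression tree and then updating a module library more concrete, the following is a minimal Python sketch. It is not the paper's algorithm: the tuple-based tree encoding, the function names `subtrees`, `extract_modules`, and `update_library`, and the use of occurrence frequency as a stand-in for a module's usefulness are all assumptions made for illustration only.

```python
from collections import Counter

# Assumed encoding: an expression tree is a nested tuple (operator, child, ...),
# and leaves (variables or constants) are plain strings.


def subtrees(expr):
    """Yield every non-leaf subtree of expr, children before parents."""
    if isinstance(expr, tuple):
        for child in expr[1:]:
            yield from subtrees(child)
        yield expr


def extract_modules(expressions, min_count=2):
    """Collect candidate modules: subtrees seen at least min_count times.

    Frequency is a hypothetical proxy for usefulness, chosen for simplicity.
    """
    counts = Counter()
    for expr in expressions:
        counts.update(subtrees(expr))
    return {sub: n for sub, n in counts.items() if n >= min_count}


def update_library(library, candidates, capacity):
    """Merge new candidates into the library and keep the top `capacity` entries.

    Entries that fall outside the capacity are dropped, which mimics
    eliminating unnecessary modules while admitting new useful ones.
    """
    merged = Counter(library)
    merged.update(candidates)
    return dict(merged.most_common(capacity))


expressions = [
    ("add", ("mul", "x", "x"), "y"),
    ("sin", ("mul", "x", "x")),
    ("add", ("mul", "x", "x"), "y"),
]
candidates = extract_modules(expressions)
library = update_library({}, candidates, capacity=1)
```

In this toy run, the subtree `("mul", "x", "x")` occurs in all three expressions, so it is the single module retained when the capacity is 1. A retained module could then be offered to the search as one additional operator alongside the basic ones, which is the sense in which modules expand the search library.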