Improved Theoretically-Grounded Evolutionary Algorithms for Subset Selection with a Linear Cost Constraint

Published: 01 May 2025, Last Modified: 18 Jun 2025
Venue: ICML 2025 poster
License: CC BY 4.0
Abstract: The subset selection problem with a monotone and submodular objective function under a linear cost constraint has wide applications, such as maximum coverage, influence maximization, and feature selection, to name a few. Various greedy algorithms have been proposed with good performance both theoretically and empirically. Recently, evolutionary algorithms (EAs), inspired by Darwin's theory of evolution, have emerged as a prominent methodology, offering both empirical advantages and theoretical guarantees. Among these, the multi-objective EA, POMC, has demonstrated the best empirical performance to date, achieving an approximation guarantee of $(1/2)(1-1/e)$. However, there remains a gap in the approximation bounds of EAs compared to greedy algorithms, and their full theoretical potential is yet to be realized. In this paper, we re-analyze the approximation performance of POMC theoretically, and derive an improved guarantee of $1/2$, which thus provides theoretical justification for its encouraging empirical performance. Furthermore, we propose a novel multi-objective EA, EPOL, which not only achieves the best-known practical approximation guarantee of $0.6174$, but also delivers superior empirical performance in applications of maximum coverage and influence maximization. We hope this work not only helps to better solve the subset selection problem, but also enhances our theoretical understanding of EAs.
Lay Summary: Selecting the best subset of items (e.g., features in a dataset or influencers in a social network) while balancing quality and cost is a common challenge in AI. Greedy algorithms—step-by-step methods—have long been used for this, but evolutionary algorithms (EAs), inspired by natural selection, have recently shown promise. One EA, called POMC, works well in practice but had a weaker theoretical guarantee than greedy methods. In this work, we improved POMC’s theoretical guarantee, proving it can achieve at least half the optimal solution’s value. We also designed a new EA, EPOL, which not only matches the best-known theoretical performance (61.74% of optimal) but also outperforms existing methods in real-world tasks like identifying key influencers or selecting important data features. Our findings advance both the practical use and theoretical understanding of evolutionary algorithms, offering better tools for subset selection problems. This work bridges the gap between theory and practice, helping researchers and practitioners make smarter, data-driven choices.
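To make the problem setting concrete, here is a minimal Python sketch of subset selection under a linear cost constraint, using maximum coverage as the monotone submodular objective. It is an illustration only: it implements a generic cost-benefit greedy baseline, not POMC or EPOL, and the toy instance, the function names (`coverage`, `cost`, `greedy_cost_benefit`), and the budget value are all assumptions made for this example rather than anything taken from the paper or its code.

```python
from typing import Dict, Set


def coverage(selected: Set[str], sets: Dict[str, Set[int]]) -> int:
    """Monotone submodular objective: number of elements covered."""
    covered: Set[int] = set()
    for name in selected:
        covered |= sets[name]
    return len(covered)


def cost(selected: Set[str], costs: Dict[str, float]) -> float:
    """Linear cost: sum of the per-item costs."""
    return sum(costs[name] for name in selected)


def greedy_cost_benefit(sets: Dict[str, Set[int]],
                        costs: Dict[str, float],
                        budget: float) -> Set[str]:
    """Generic greedy baseline (an assumption, not the paper's method):
    repeatedly add the affordable item with the best marginal gain per
    unit cost, then compare against the best single affordable item."""
    chosen: Set[str] = set()
    remaining = set(sets)
    while remaining:
        base = coverage(chosen, sets)
        spent = cost(chosen, costs)
        best, best_ratio = None, 0.0
        for name in sorted(remaining):  # sorted for deterministic ties
            if spent + costs[name] > budget:
                continue  # adding this item would exceed the budget
            gain = coverage(chosen | {name}, sets) - base
            ratio = gain / costs[name]
            if ratio > best_ratio:
                best, best_ratio = name, ratio
        if best is None:
            break  # nothing affordable improves the objective
        chosen.add(best)
        remaining.discard(best)
    # Guard against a single valuable item that the ratio rule skipped.
    singles = sorted(n for n in sets if costs[n] <= budget)
    if singles:
        top = max(singles, key=lambda n: coverage({n}, sets))
        if coverage({top}, sets) > coverage(chosen, sets):
            return {top}
    return chosen


# Hypothetical toy instance: four candidate sets over elements 1..6.
sets = {"a": {1, 2, 3}, "b": {3, 4}, "c": {5}, "d": {1, 2, 3, 4, 5, 6}}
costs = {"a": 2.0, "b": 1.0, "c": 1.0, "d": 5.0}
budget = 4.0

result = greedy_cost_benefit(sets, costs, budget)
print(sorted(result), coverage(result, sets), cost(result, costs))
```

On this toy instance the item `d` covers everything but is unaffordable, so the baseline selects the three cheaper sets. The approximation guarantees quoted above bound how far the value of such a returned subset can fall below the optimum in the worst case.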
Link To Code: https://github.com/lamda-bbo/EPOL
Primary Area: Optimization->Discrete and Combinatorial Optimization
Keywords: submodular optimization, linear constraints, multi-objective evolutionary algorithms
Submission Number: 11929