Evolutionary Alpha Factor Discovery with Large Language Models for Sparse Portfolio Optimization

Haochen Luo; Yuan Zhang; Chen Liu

16 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · Readers: Everyone · License: CC BY 4.0

Keywords: Portfolio Optimization, Alpha Factor Mining, Quantitative Investment, Large Language Models, Evolutionary Searching

Abstract: Sparse portfolio optimization remains a fundamental yet difficult challenge in quantitative finance, as traditional approaches that rely on historical return estimates and static objectives often struggle to adapt to shifting market dynamics. To address this, we propose a new framework that leverages large language models (LLMs) to automate the discovery and iterative refinement of alpha factors tailored for sparse portfolio construction. By reformulating asset selection as a top-m ranking problem guided by factor signals, the framework integrates an evolutionary feedback loop to continuously enhance the factor pool based on performance. Extensive experiments across five Fama–French benchmark datasets and two real-world datasets (US and China) show that our approach consistently outperforms both statistical and optimization-based baselines, particularly in high-volatility and large-universe settings. Ablation studies further highlight the importance of prompt design, factor diversity, and the choice of LLM backend. These results suggest that language-model-guided factor generation offers a promising, interpretable, and adaptive solution for portfolio optimization under sparsity constraints.
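The top-m ranking formulation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the momentum factor, the ticker symbols, and the prices are all hypothetical, and equal weighting of the selected assets is an assumption, since the abstract does not say how weights are assigned.

```python
from typing import Dict, List


def momentum_factor(prices: List[float], lookback: int = 3) -> float:
    """Hypothetical alpha factor: simple return over the last `lookback` periods."""
    return prices[-1] / prices[-1 - lookback] - 1.0


def top_m_portfolio(scores: Dict[str, float], m: int) -> Dict[str, float]:
    """Keep the m highest-scoring assets and weight them equally.

    Equal weighting is an assumption of this sketch, not a detail from the paper.
    """
    if not 1 <= m <= len(scores):
        raise ValueError("m must be between 1 and the number of assets")
    # Rank by score, highest first; ties are broken by asset name for determinism.
    ranked = sorted(scores, key=lambda asset: (-scores[asset], asset))
    return {asset: 1.0 / m for asset in ranked[:m]}


# Toy price histories (made-up numbers, for illustration only).
price_history = {
    "AAA": [100.0, 101.0, 103.0, 106.0],
    "BBB": [50.0, 49.0, 48.5, 48.0],
    "CCC": [20.0, 20.5, 21.0, 22.0],
    "DDD": [75.0, 75.0, 76.0, 75.5],
}

scores = {ticker: momentum_factor(p) for ticker, p in price_history.items()}
weights = top_m_portfolio(scores, m=2)
print(weights)
```

In a framework like the one described, the scoring function would presumably be one of the LLM-generated factors, and the realized performance of the resulting sparse portfolio would feed back into the evolutionary loop that refines the factor pool. The sketch covers only the selection step.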

Supplementary Material: zip

Primary Area: other topics in machine learning (i.e., none of the above)

Submission Number: 6895
