Keywords: Alpha Mining, Agentic AI, Quantitative Investment, Self-evolving Agent
TL;DR: AlphaAgentEvo introduces a new evolution-oriented paradigm for alpha mining via self-evolving agentic reinforcement learning, outperforming traditional and LLM baselines—even surpassing state-of-the-art LLMs with only 1.7B–4B parameters.
Abstract: Alpha mining seeks to identify predictive alpha factors that generate excess returns relative to the market from a vast and noisy search space; however, existing evolution-based approaches struggle to facilitate the systematic evolution of alphas. Traditional methods, such as Genetic Programming (GP), cannot interpret natural language instructions and often fail to extract valuable insights from unsuccessful attempts, leading to low interpretability and inefficient exploration. Analogously, lacking mechanisms for systematic evolution, e.g., long-term planning and reflection, existing multi-agent approaches easily fall into repetitive evolutionary routines, resulting in inefficient evolution. To overcome these limitations, we introduce AlphaAgentEvo, a self-evolving Agentic Reinforcement Learning (ARL) framework that moves alpha mining beyond the brittle search-backtest-restart cycle toward a continuous trajectory of evolution. Guided by a hierarchical reward function, our agent engages in self-exploration of the search space, first mastering basic requirements (e.g., valid tool calls) and then pursuing harder objectives (e.g., continuous performance improvements). Through this process, the agent acquires advanced behaviors such as long-horizon planning and reflective reasoning, which enable it to actively react to the underlying market state (e.g., regime shifts) and to evolve continually, marking a step toward more principled and scalable alpha mining. Extensive experiments demonstrate that AlphaAgentEvo achieves more efficient alpha evolution and generates diverse, transferable alphas, consistently surpassing a wide range of baselines. Notably, with only 4B parameters, it outperforms LLM-driven evolution methods configured with state-of-the-art closed-source reasoning models, highlighting the promise of ARL for next-generation alpha mining.
Supplementary Material: zip
Primary Area: foundation or frontier models, including LLMs
Submission Number: 3446