AlphaQCM: Alpha Discovery in Finance with Distributional Reinforcement Learning

Zhoufan Zhu; Ke Zhu

Published: 01 May 2025; Last Modified: 23 Jul 2025; ICML 2025 poster; CC BY 4.0

Abstract: For researchers and practitioners in finance, finding synergistic formulaic alphas is very important but challenging. In this paper, we reconsider the discovery of synergistic formulaic alphas from the viewpoint of sequential decision-making, and conceptualize the entire alpha discovery process as a non-stationary and reward-sparse Markov decision process. To overcome the challenges of non-stationarity and reward-sparsity, we propose the AlphaQCM method, a novel distributional reinforcement learning method designed to search for synergistic formulaic alphas efficiently. The AlphaQCM method first learns the Q function and quantiles via a Q network and a quantile network, respectively. Then, the AlphaQCM method applies the quantiled conditional moment method to learn unbiased variance from the potentially biased quantiles. Guided by the learned Q function and variance, the AlphaQCM method navigates the non-stationarity and reward-sparsity to explore the vast search space of formulaic alphas with high efficacy. Empirical applications to real-world datasets demonstrate that our AlphaQCM method significantly outperforms its competitors, particularly when dealing with large datasets comprising numerous stocks.
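As a rough illustration of the idea of recovering a variance from estimated quantiles, the sketch below regresses a set of quantile estimates on a Cornish-Fisher-style basis of standard normal quantiles and reads the variance off the squared slope. This is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not spell out the regression form, and the function names `qcm_moments` and `select_action`, the choice of basis, and the `bonus` weight are all hypothetical. The second function shows one plausible way a learned variance could steer exploration, by adding a standard-deviation bonus to each Q value.

```python
import numpy as np
from statistics import NormalDist


def qcm_moments(quantiles, taus):
    """Estimate a mean and variance from quantile estimates.

    Assumption (not from the source): the quantile at level tau is
    approximated by a Cornish-Fisher-style expansion
        Q(tau) ~ b0 + b1*z + b2*(z^2 - 1) + b3*(z^3 - 3z),
    where z is the standard normal quantile at tau. Under this form,
    b0 plays the role of the mean and b1 the standard deviation.
    """
    quantiles = np.asarray(quantiles, dtype=float)
    taus = np.asarray(taus, dtype=float)
    # Standard normal quantiles for each level tau.
    z = np.array([NormalDist().inv_cdf(float(t)) for t in taus])
    # Design matrix with the intercept and Hermite-type terms.
    X = np.column_stack([np.ones_like(z), z, z**2 - 1.0, z**3 - 3.0 * z])
    # Ordinary least squares fit of the quantiles on the basis.
    beta, *_ = np.linalg.lstsq(X, quantiles, rcond=None)
    mean = float(beta[0])
    variance = float(beta[1] ** 2)
    return mean, variance


def select_action(q_values, variances, bonus=1.0):
    """Pick the action maximizing Q plus a standard-deviation bonus.

    The additive bonus is a hypothetical exploration rule used here
    only to show how a variance estimate could guide the search.
    """
    q_values = np.asarray(q_values, dtype=float)
    variances = np.asarray(variances, dtype=float)
    scores = q_values + bonus * np.sqrt(variances)
    return int(np.argmax(scores))


if __name__ == "__main__":
    taus = np.linspace(0.05, 0.95, 19)
    z = np.array([NormalDist().inv_cdf(float(t)) for t in taus])
    # Quantiles of a normal distribution with mean 2 and std 3,
    # shifted by a constant to mimic a systematic bias.
    biased_quantiles = 2.0 + 3.0 * z + 0.5
    print(qcm_moments(biased_quantiles, taus))
    print(select_action([1.0, 0.9], [0.0, 1.0], bonus=1.0))
```

In this sketch a constant shift in the quantile estimates moves only the intercept of the regression, so the slope-based variance is unaffected by that particular kind of bias. Other forms of quantile bias would need the error structure assumed by the actual method.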

Lay Summary: In finance, discovering interpretable and powerful signals (also known as formulaic alphas) for stock price prediction is both important and challenging. This task is made difficult by three key issues: the vast search space of possible alphas, the fact that most discovered alphas are weak, and the high correlation between the few strong alphas. Existing methods struggle to efficiently find synergistic formulaic alphas, particularly when the financial system is complex. To address these challenges, we reconsidered the alpha discovery problem from the viewpoint of sequential decision-making, and conceptualized the entire alpha discovery process as a non-stationary and reward-sparse Markov decision process. We then proposed the AlphaQCM method, a novel distributional reinforcement learning method designed to efficiently search for synergistic formulaic alphas. The core innovation of AlphaQCM is its ability to learn unbiased variance from the potentially biased quantiles, which guides alpha discovery in the vast search space. In experiments on real-world stock market datasets, AlphaQCM outperformed existing methods, especially when dealing with large and complex datasets. This innovation could substantially improve how financial practitioners and researchers discover predictive and interpretable alphas.

Application-Driven Machine Learning: This submission is on Application-Driven Machine Learning.

Link To Code: https://github.com/ZhuZhouFan/AlphaQCM

Primary Area: Applications->Everything Else

Keywords: Computational Finance, Formulaic Alpha, Distributional Reinforcement Learning, Quantiled Conditional Moments, Stock Trend Forecasting

Submission Number: 1805
