Aligning News and Prices: A Cross-Modal LLM-Enhanced Transformer DRL Framework for Volatility-Adaptive Stock Trading

ICLR 2026 Conference Submission9001 Authors

17 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Large Language Model, Automated Stock Trading, Multimodal Fusion
Abstract: While Deep Reinforcement Learning (DRL) has shown promise for stock trading, its practical application is constrained by critical gaps that undermine performance in volatile real-world markets, most notably during events such as the 2020 COVID-19 market crash. Existing DRL methods fail to capitalize on textual financial news (a key leading indicator of market sentiment), struggle to model multi-scale temporal dynamics, and lack robustness to extreme volatility, leaving them unable to adapt to sudden shifts in market fundamentals. To address these limitations, we propose a volatility-adaptive, multimodal DRL framework for stock trading that integrates pre-trained Large Language Models (LLMs), Transformers, and the Soft Actor-Critic (SAC) algorithm. The framework first uses an LLM-driven module to extract sentiment and event features from financial news, maps price dynamics into the LLM’s semantic space via a multi-head attention reprogramming layer, and fuses the two modalities with cross-attention to capture intrinsic news-price interdependencies. To enrich the state representation, a Transformer encoder models short- and long-term news sentiment trends, price fluctuations, and inter-stock correlations, merging these heterogeneous features into a compact, unified state via multi-head attention. Finally, gradient feedback from SAC’s critic network is propagated to the Transformer, enabling end-to-end optimization of feature learning and the trading policy. Empirical evaluations on NASDAQ-100 data show that our framework outperforms existing DRL methods in multi-stock trading and surpasses Transformer-based methods in single-stock prediction, with ablations confirming that the core modules drive the performance gains.
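The cross-attention fusion step described in the abstract can be sketched as follows. This is a minimal single-head NumPy illustration, not the authors' implementation: the dimensions, weight matrices, and the choice of price features as queries over news features are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(price_feats, news_feats, Wq, Wk, Wv):
    """Price features act as queries over news features, so each price
    time step receives a news-conditioned summary (single-head sketch)."""
    Q = price_feats @ Wq                      # (T_price, d)
    K = news_feats @ Wk                       # (T_news, d)
    V = news_feats @ Wv                       # (T_news, d)
    scores = Q @ K.T / np.sqrt(Q.shape[-1])   # scaled dot-product scores
    return softmax(scores, axis=-1) @ V       # (T_price, d) fused features

# Illustrative shapes: 30 price steps, 12 news items (hypothetical sizes).
rng = np.random.default_rng(0)
d_price, d_news, d = 8, 16, 8
price = rng.standard_normal((30, d_price))
news = rng.standard_normal((12, d_news))
Wq = rng.standard_normal((d_price, d))
Wk = rng.standard_normal((d_news, d))
Wv = rng.standard_normal((d_news, d))
fused = cross_attention(price, news, Wq, Wk, Wv)
print(fused.shape)  # (30, 8): one fused vector per price time step
```

In the full framework, such fused representations would feed the Transformer encoder that builds the state for the SAC agent; multi-head variants simply run several such projections in parallel and concatenate the results.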
Primary Area: reinforcement learning
Submission Number: 9001