Structuring News, Shaping Alpha: RL-Enhanced LLMs in a Hybrid Framework for Event Driven Financial Forecasting

Haohan Zhang; Saizhuo Wang; Hao Kong; Baozhu Shang

Published: 21 Nov 2025, Last Modified: 14 Jan 2026. Venue: GenAI in Finance Poster. License: CC BY 4.0

Keywords: AI Finance, LLM Stock Prediction

Abstract: There has been an emergent field within AI-powered financial forecasting that leverages alternative data, particularly unstructured news and event information. Existing approaches often rely on fixed sentiment lexicons or manually defined event taxonomies, while recent advances in large language models (LLMs) have inspired the use of prompt engineering to structure such events into features for predictive modeling. However, such methods, though flexible across modalities, fail to adapt to the constantly shifting dynamics of financial markets. Directly using human-annotated labels to guide adaptation is impractical, as annotations in financial domains are often not explicitly defined. How, then, can we align LLM event structuring with predictive objectives in a scalable and efficient way? In this work, we propose Structuring News, Shaping Alpha, a hybrid framework that integrates reinforcement learning–enhanced LLMs with ensemble-based forecasting models. Our system employs an LLM to re-classify financial events into structured categories, which are passed as features into a downstream ensemble predictor. Crucially, the LLM's classification policy is optimized in a closed-loop setting via Proximal Policy Optimization (PPO), where the reward derives not from human supervision but from the predictive value of the resulting features, measured through the information coefficient (IC) against market returns. We argue that in domain tasks such as financial forecasting, the LLM's strength lies in feature extraction, while the machine learning model excels at mapping structured features to numerical outputs. By combining these strengths, we advance a hybrid modeling paradigm in which LLMs and machine learning models each perform what they do best, yielding more adaptive and powerful event-driven prediction.
Experiments on large-scale Chinese A-share stock data demonstrate that our RL-enhanced classifications yield a non-trivial information coefficient and consistently outperform carefully engineered prompt-only methods that use a flagship LLM.
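
To make the IC-based reward described in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes that IC is computed as a cross-sectional Spearman rank correlation, that each LLM-assigned event category has already been mapped to a numeric score per stock, and that the scalar reward is the mean daily IC. The function names `information_coefficient` and `ic_reward` are hypothetical.

```python
import numpy as np


def average_ranks(values):
    """Return ranks of `values`, with tied entries sharing their mean rank."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values), dtype=float)
    for v in np.unique(values):
        mask = values == v
        ranks[mask] = ranks[mask].mean()
    return ranks


def information_coefficient(feature, forward_returns):
    """Spearman rank correlation between one day's feature values and returns.

    Assumption: rank IC is used; the abstract does not state which variant.
    """
    f_rank = average_ranks(feature)
    r_rank = average_ranks(forward_returns)
    # A constant feature or constant returns carries no ranking signal.
    if f_rank.std() == 0 or r_rank.std() == 0:
        return 0.0
    return float(np.corrcoef(f_rank, r_rank)[0, 1])


def ic_reward(daily_features, daily_returns):
    """Hypothetical scalar reward: mean cross-sectional IC over trading days."""
    ics = [
        information_coefficient(f, r)
        for f, r in zip(daily_features, daily_returns)
    ]
    return float(np.mean(ics))


if __name__ == "__main__":
    # Toy data: 3 days, 5 stocks; scores stand in for encoded event categories.
    scores = [[2, 0, 1, 3, 4], [1, 1, 0, 2, 3], [0, 4, 2, 1, 3]]
    returns = [
        [0.01, -0.02, 0.00, 0.02, 0.03],
        [0.00, 0.01, -0.01, 0.02, 0.04],
        [0.02, -0.01, 0.00, 0.01, -0.02],
    ]
    print(ic_reward(scores, returns))
```

In a closed-loop setup of the kind the abstract outlines, a scalar like this could serve as the reward passed to a PPO update of the classification policy; how the reward is batched, normalized, or assigned to individual classifications is not specified in the abstract and is left open here.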

Submission Number: 107
