Learning to Generate Explainable Stock Predictions using Self-Reflective Large Language Models

Kelvin J.L. Koa; Yunshan Ma; Ritchie Ng; Tat-Seng Chua

Published: 23 Jan 2024, Last Modified: 23 May 2024, Venue: TheWebConf24

Keywords: computational finance, stock prediction, large language models, explainable AI, self-reflective

Abstract: In this work, we design a Large Language Model (LLM) based framework to generate explainable next-day stock predictions from web-mined social texts. Explaining stock predictions is generally a difficult task for traditional non-generative deep learning models, where explanations are limited to visualizing the attention weights on important texts. Today, LLMs present a solution to this problem, given their known capabilities to generate human-readable explanations for their decision-making process. However, the task of stock prediction remains challenging for LLMs, as it requires the ability to weigh the varying impacts of chaotic social texts on stock prices. The problem gets progressively harder with the introduction of the explanation component, which requires LLMs to explain verbally why certain factors are more important than others. At the same time, fine-tuning LLMs for such a task would require expert-annotated explanations for every stock movement in the training set, which is expensive and impractical to scale. To tackle these issues, we propose a training framework that utilizes a verbal self-reflective agent and Proximal Policy Optimization (PPO), which allows an LLM to teach itself how to generate explainable stock predictions in a fully autonomous manner. The reflective agent allows the LLM to learn how to explain past stock movements through a self-reasoning process, while the PPO trainer trains the model to generate the most likely explanations given the input texts. The training samples for the PPO trainer are the responses generated during the reflective process, which eliminates the need for human annotators. Using our Summarize-Explain-Predict (SEP) framework, we fine-tune an LLM that outperforms traditional deep-learning methods and pre-trained LLMs in prediction accuracy and Matthews correlation coefficient (MCC) for the stock classification task. To assess the generalization capability of the SEP framework, we further test it on the portfolio-making task, and demonstrate its effectiveness through portfolio metrics such as its Sharpe Ratio.
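For readers unfamiliar with the evaluation metrics named in the abstract, the following is a minimal sketch of how accuracy, MCC, and a Sharpe ratio are commonly computed. It is not the authors' evaluation code. The function names, the label encoding (1 for a price rise, 0 for a fall), the zero risk-free rate, and the per-period (non-annualized) Sharpe ratio are all assumptions made for illustration.

```python
import math

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels.

    Assumed encoding: 1 = price rises, 0 = price falls.
    Returns 0.0 when the denominator is zero (a common convention).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def sharpe_ratio(returns, risk_free=0.0):
    """Per-period Sharpe ratio: mean excess return over its sample std.

    Not annualized; risk_free is an assumed per-period rate.
    """
    excess = [r - risk_free for r in returns]
    mean = sum(excess) / len(excess)
    var = sum((x - mean) ** 2 for x in excess) / (len(excess) - 1)
    std = math.sqrt(var)
    return mean / std if std else 0.0

if __name__ == "__main__":
    # Toy labels and returns, invented purely for illustration.
    y_true = [1, 1, 0, 0]
    y_pred = [1, 0, 0, 0]
    print(accuracy(y_true, y_pred))
    print(mcc(y_true, y_pred))
    print(sharpe_ratio([0.01, 0.03]))
```

MCC is often reported alongside accuracy for stock movement classification because it accounts for all four cells of the confusion matrix, so a model that always predicts the majority class scores near zero even when its accuracy looks reasonable.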

Track: Web Mining and Content Analysis

Submission Guidelines Scope: Yes

Submission Guidelines Blind: Yes

Submission Guidelines Format: Yes

Submission Guidelines Limit: Yes

Submission Guidelines Authorship: Yes

Student Author: Yes

Submission Number: 1792
