Keywords: Financial LLM; Asset Pricing; Stock Price Prediction
Abstract: We propose a new framework for topic modeling of financial news that integrates large language models (LLMs) with document embeddings, clustering, and GPT-4-based topic refinement. Our method filters and processes Dow Jones financial articles, embeds them using OpenAI models, clusters the embeddings with UMAP and HDBSCAN, and uses GPT-4 to generate, deduplicate, and refine topic descriptions. Our model generates higher-quality and more stable topics than conventional topic models such as LDA and BERTopic, and substantially improves forecasting accuracy. The generated topics are highly interpretable and distinct, contain rich information about the state of the economy, and have high predictive power for macroeconomic and stock market performance. This study represents the first application of GPT-4-assisted clustering refinement to topic modeling and financial forecasting.
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 5321
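
The stages named in the abstract (embed, cluster, describe, deduplicate) can be illustrated with a minimal, standard-library-only sketch. Everything here is an assumption made for illustration and not the authors' implementation: `embed` is a bag-of-words stand-in for OpenAI embeddings, `cluster` is a greedy cosine-similarity grouping standing in for UMAP and HDBSCAN, `describe` picks frequent terms in place of GPT-4 topic generation, and `deduplicate` merges identical labels as a crude proxy for GPT-4 deduplication. The headlines, function names, and the 0.3 threshold are all hypothetical.

```python
import math
from collections import Counter

# Toy stand-ins throughout: no OpenAI, UMAP, HDBSCAN, or GPT-4 calls are made.
STOPWORDS = {"the", "a", "on", "as", "in", "of", "and", "to"}

def embed(text):
    """Stand-in embedding: bag-of-words counts (NOT an OpenAI embedding)."""
    return Counter(w for w in text.lower().split() if w not in STOPWORDS)

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def cluster(vectors, threshold=0.3):
    """Stand-in clustering: join the first cluster whose seed is similar enough."""
    clusters = []  # each cluster is a list of document indices
    for i, vec in enumerate(vectors):
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])  # no close cluster found: start a new one
    return clusters

def describe(vectors, members, top_k=2):
    """Stand-in for LLM topic description: the most frequent terms in a cluster."""
    total = Counter()
    for i in members:
        total.update(vectors[i])
    return " / ".join(sorted(w for w, _ in total.most_common(top_k)))

def deduplicate(labelled):
    """Merge clusters that received an identical description."""
    merged = {}
    for label, members in labelled:
        merged.setdefault(label, []).extend(members)
    return merged

# Hypothetical headlines, invented for this sketch.
articles = [
    "Fed raises interest rates",
    "Fed signals further interest rates hikes",
    "Oil prices surge on supply cut",
    "Oil prices fall as supply recovers",
]

vectors = [embed(a) for a in articles]
clusters = cluster(vectors)
topics = deduplicate((describe(vectors, m), m) for m in clusters)

for label, members in topics.items():
    print(label, "->", [articles[i] for i in members])
```

In this toy run the two monetary-policy headlines and the two oil headlines fall into separate groups. A real pipeline would replace each stand-in with the corresponding component from the abstract, and the quality of the resulting topics would depend on those components rather than on anything shown here.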