Keywords: Data Augmentation, Financial Sentiment Analysis, RAG, BERT, LLMs
TL;DR: We develop a framework that uses RAG to mitigate the temporal gap between older human-annotated financial sentiment datasets and the modern financial context, together with a fine-tuned classifier that keeps the augmented dataset robust.
Abstract: Static and outdated datasets hinder the accuracy of Financial Sentiment Analysis (FSA) in capturing rapidly evolving market sentiment. We tackle this by proposing a novel data augmentation technique using Retrieval-Augmented Generation (RAG). Our method leverages a generative LLM to infuse established benchmarks with up-to-date contextual information from contemporary financial news. This RAG-based augmentation significantly modernizes the data's alignment with current financial language, and we employ a robust BERT-BiGRU judge model to ensure that the sentiment of the original annotations is faithfully preserved. FSA models trained on this enriched data exhibit enhanced performance on unseen test sets, demonstrating the practical value of our approach for developing more reliable and current sentiment classifiers.
Archival Status: Archival
Paper Length: Long Paper (up to 8 pages of content)
Submission Number: 353