InvestAlign: Overcoming Data Scarcity in Aligning Large Language Models with Investor Decision-Making Processes Under Herd Behavior

ACL ARR 2025 February Submission7738 Authors

16 Feb 2025 (modified: 09 May 2025), ACL ARR 2025 February Submission, CC BY 4.0

Abstract: Aligning Large Language Models (LLMs) with investor decision-making processes under herd behavior is a critical challenge in behavioral finance, and it faces a fundamental limitation: the scarcity of real-user data needed for Supervised Fine-Tuning (SFT). While SFT can bridge the gap between LLM outputs and human behavioral patterns, its reliance on massive authentic data imposes substantial collection costs and privacy risks. We propose **InvestAlign**, a novel framework that constructs high-quality SFT datasets by leveraging theoretical solutions to similar and simpler optimal investment problems rather than complex scenarios. Our theoretical analysis demonstrates that training LLMs with **InvestAlign**-generated data achieves faster parameter convergence than using real-user data, suggesting superior learning efficiency. Furthermore, we develop **InvestAgent**, an LLM agent fine-tuned with **InvestAlign**, which demonstrates significantly closer alignment to real-user data than pre-SFT models in both simpler and complex investment problems. This highlights **InvestAlign** as a promising approach with the potential to address complex optimal investment problems and align LLMs with investor decision-making processes under herd behavior.

Paper Type: Long

Research Area: Language Modeling

Research Area Keywords: data-efficient training, data augmentation, fine-tuning, human behavior analysis

Contribution Types: Model analysis & interpretability, NLP engineering experiment, Approaches to low-resource settings, Approaches low compute settings-efficiency, Theory

Languages Studied: English

Submission Number: 7738
