Abstract: Large Language Models (LLMs) have emerged as versatile tools across various financial applications. However, their pre-training corpora introduce the risk of encoded biases, potentially leading to unjust decisions when the models are deployed in real-world scenarios. Understanding LLMs' implicit stock market preferences is therefore vital for their responsible use in financial applications. This paper investigates the stock preferences of five representative LLMs, both commercial and open-source, including the closed-source ChatGPT series and the open-source Llama and Mistral, using a dataset we collected covering over 4,000 tickers from the U.S. and Chinese stock markets. We employ carefully crafted preference prompts and calibration techniques to probe LLM biases, ensuring that the measurements reliably reflect model preferences. Our investigation reveals significant biases among LLMs regarding different stock tickers, including a distinct preference for the stocks of U.S. companies over Chinese companies. LLMs also show clear preferences for specific industries. To address these biases, we propose a mitigation method that improves the fairness of LLMs through prompt engineering. Experimental results demonstrate that this method effectively corrects the biases, yielding significant improvements in model fairness. By shedding light on LLMs' stock preferences and offering a practical solution for mitigating biases, this study contributes to the responsible development and application of LLMs in financial domains.
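For illustration, below is a minimal Python sketch of what pairwise preference probing with calibration might look like. The prompt wording, the probe interface, and the contextual-calibration-style correction are our illustrative assumptions, not necessarily the paper's exact protocol:

# A minimal sketch of pairwise stock-preference probing with a
# contextual-calibration-style correction. The templates and the
# `probe` interface below are illustrative assumptions, not the
# paper's actual setup.
from typing import Callable, Dict

# A probe maps a prompt to the model's probabilities for the answers
# "A" and "B", e.g. by reading the log-probabilities of the two tokens.
Probe = Callable[[str], Dict[str, float]]

PREFERENCE_TEMPLATE = (
    "Which stock would you prefer to invest in?\n"
    "A) {ticker_a}\n"
    "B) {ticker_b}\n"
    "Answer with a single letter, A or B."
)

# Content-free prompt used to estimate the model's prior over the labels.
NULL_TEMPLATE = PREFERENCE_TEMPLATE.format(ticker_a="N/A", ticker_b="N/A")

def calibrated_preference(probe: Probe, ticker_a: str, ticker_b: str) -> Dict[str, float]:
    """Return calibrated P(A)/P(B) for one ordered ticker pair."""
    raw = probe(PREFERENCE_TEMPLATE.format(ticker_a=ticker_a, ticker_b=ticker_b))
    prior = probe(NULL_TEMPLATE)  # label bias independent of the tickers
    # Divide out the label prior, then renormalize over the two options.
    scores = {k: raw[k] / max(prior[k], 1e-9) for k in ("A", "B")}
    total = sum(scores.values())
    return {k: v / total for k, v in scores.items()}

In practice one would also average the result over both answer orderings (A/B swapped) to cancel position bias, and aggregate over many ticker pairs (e.g., U.S. versus Chinese tickers) to estimate a market-level preference.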
Paper Type: Long
Research Area: Ethics, Bias, and Fairness
Research Area Keywords: model bias/fairness evaluation, model bias/unfairness mitigation, prompting
Contribution Types: Model analysis & interpretability
Languages Studied: English, Chinese
Submission Number: 1640