FinTrust: A Comprehensive Benchmark of Trustworthiness Evaluation in Finance Domain

ACL ARR 2025 May Submission7185 Authors

20 May 2025 (modified: 29 Jul 2025), ACL ARR 2025 May Submission, CC BY 4.0

Abstract: Recent LLMs have demonstrated promising ability in solving finance-related problems. However, applying LLMs in real-world finance applications remains challenging because of the high risks and high stakes involved. This paper introduces FinTrust, a comprehensive benchmark specifically designed for evaluating the trustworthiness of LLMs in finance applications. Our benchmark covers a wide range of alignment issues grounded in practical contexts and features fine-grained tasks for each dimension of trustworthiness evaluation. We assess eight LLMs on FinTrust and find that proprietary models like GPT-4.1 perform best on many tasks, such as trustfulness, while open-source models like DeepSeek-V3 have an advantage in specific areas like industry-level fairness. For challenging tasks like fiduciary alignment and disclosure, no LLM performs satisfactorily, showing a significant gap in the legal awareness of LLMs. We believe that FinTrust can be a valuable benchmark for evaluating LLMs' trustworthiness in the finance domain.

Paper Type: Long

Research Area: Resources and Evaluation

Research Area Keywords: Resources and Evaluation

Contribution Types: Model analysis & interpretability, Data resources

Languages Studied: English, Hausa

Submission Number: 7185
