Submission Type: Regular Short Paper
Submission Track: NLP Applications
Submission Track 2: Interpretability, Interactivity, and Analysis of Models for NLP
Keywords: Financial Natural Language Processing, Large Language Models, ChatGPT
Abstract: The emergence of Large Language Models (LLMs), such as ChatGPT, has revolutionized general natural language preprocessing (NLP) tasks. However, their expertise in the financial domain lacks a comprehensive evaluation. To assess the ability of LLMs to solve financial NLP tasks, we present FinLMEval, a framework for Financial Language Model Evaluation, comprising nine datasets designed to evaluate the performance of language models.
This study compares the performance of fine-tuned auto-encoding language models (BERT, RoBERTa, FinBERT) and the LLM ChatGPT. Our findings reveal that while ChatGPT demonstrates notable performance across most financial tasks, it generally lags behind the fine-tuned expert models, especially when dealing with proprietary datasets. We hope this study builds foundation evaluation benchmarks for continuing efforts to build more advanced LLMs in the financial domain.
Submission Number: 2825
Loading