Keywords: Financial Fact Checking, Efficient Fact Checking, Small LLMs
TL;DR: We introduce FISCAL, a framework that generates synthetic financial data to train MiniCheck-FISCAL, a compact fact-checker matching larger LLMs in accuracy while remaining efficient and cost-effective.
Abstract: Financial applications of large language models (LLMs) require factual reliability and computational efficiency, yet current systems often hallucinate details and rely on prohibitively large models. We propose \textsc{FISCAL} (Financial Synthetic Claim–Document Augmented Learning), a modular framework for generating synthetic data tailored to financial fact-checking. Using FISCAL, we first generate a dataset, FISCAL-data, and then train \textsc{MiniCheck-FISCAL}, a lightweight verifier for numerical financial claims. MiniCheck-FISCAL outperforms its baseline, surpasses GPT-3.5 Turbo and open-source peers of similar size, and approaches the accuracy of systems roughly 20x larger, such as Mixtral-8×22B and Command R+. On the external datasets FinDVer and Fin-Fact, it rivals GPT-4o and Claude-3.5 while outperforming Gemini-1.5 Flash, despite being an order of magnitude smaller. These results show that domain-specific synthetic data, combined with efficient fine-tuning, enables compact models to achieve state-of-the-art accuracy, robustness, and scalability in practical financial AI.
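The framework described above trains a verifier on synthetic claim–document pairs. As a minimal sketch of what one such training example might look like, here is an illustrative data structure; the field names, label set, and example texts are assumptions for illustration, not the paper's actual schema:

```python
from dataclasses import dataclass


@dataclass
class ClaimDocumentPair:
    """One synthetic fact-checking example: a numerical financial claim,
    an evidence document, and whether the document supports the claim.
    (Illustrative schema only; the paper's format may differ.)"""
    claim: str
    document: str
    label: str  # "supported" or "unsupported"


# A supported and an unsupported numerical claim over the same document
pairs = [
    ClaimDocumentPair(
        claim="Q3 revenue grew 12% year over year.",
        document="The company reported Q3 revenue of $4.2B, up 12% from $3.75B a year earlier.",
        label="supported",
    ),
    ClaimDocumentPair(
        claim="Q3 revenue grew 20% year over year.",
        document="The company reported Q3 revenue of $4.2B, up 12% from $3.75B a year earlier.",
        label="unsupported",
    ),
]

# A verifier would be fine-tuned to predict `label` from (claim, document)
supported = [p for p in pairs if p.label == "supported"]
```

Pairing contradictory and faithful claims against the same evidence document, as sketched here, is a common way to give a lightweight verifier hard negatives for numerical claims.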
Submission Number: 36