IFQA-LLM: intelligent intention-driven financial question-answering with large language models

Fangshu Chen, Yilin Huang, Jiahui Wang, Chengcheng Yu, Xiankai Meng

Published: 01 Jan 2025, Last Modified: 04 Oct 2025. Venue: J. Supercomput. 2025. License: CC BY-SA 4.0

Abstract: Large language models (LLMs) have broadened the prospects of many fields, yet knowledge base question-answering (KBQA) in finance still struggles to recognize the intent behind specialized professional questions and to aggregate multiple knowledge points, which often leads to poor answer quality. This paper proposes Intelligent Intention-driven Financial Question-Answering with Large Language Models (IFQA-LLM), a framework that integrates three modules to address these challenges. The intent processing module uses data augmentation and chain-of-thought (CoT) prompting to iteratively clarify ambiguous intents through multi-round dialog: it corrects grammatical errors, standardizes terminology (e.g., converting “PE ratio” to “Price Earnings Ratio”), and unifies temporal references. Because it extracts structured intents, it goes beyond traditional keyword-matching methods. The information retrieval module employs a “multi-route retrieval” approach that combines vector-based recall (semantic search with a fine-tuned embedding model, which captures contextual meaning beyond literal keywords) with inverted index-based recall (BM25 scoring for keyword precision), and merges the two result lists with the reciprocal rank fusion (RRF) algorithm to improve accuracy in multi-knowledge aggregation scenarios. The interactive learning module uses “client ratings” (scores of 0–4 on answer accuracy and conciseness) to dynamically optimize retrieval thresholds and prompt engineering; answers rated below 3 are marked as difficult samples for fine-tuning the embedding model.
Compared with current LLM-based KBQA methods, IFQA-LLM is novel in its intent-first processing paradigm, its hybrid multi-route retrieval strategy that balances semantic understanding with keyword matching, and its feedback-driven closed-loop optimization. Together these allow it to handle complex queries, support local deployment for information security, and achieve an average prediction accuracy of 91% on real financial datasets, outperforming state-of-the-art models.
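
The abstract names reciprocal rank fusion (RRF) as the step that merges the vector-based and BM25-based result lists but does not give its details. The sketch below illustrates the standard RRF formulation, in which each document receives the sum of 1 / (k + rank) over every list that contains it. The function name, the document identifiers, and the smoothing constant k = 60 (a commonly used default) are assumptions made for illustration and are not taken from the paper.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    rankings: iterable of lists, each ordered from best to worst hit.
    k: smoothing constant; 60 is a common default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based, so the top hit contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs of the two retrieval routes for one query.
vector_hits = ["doc_pe_ratio", "doc_eps", "doc_dividend"]
bm25_hits = ["doc_pe_ratio", "doc_balance_sheet", "doc_eps"]

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)
```

Because RRF uses only rank positions, it needs no calibration between embedding similarity scores and BM25 scores, which may be one reason it suits a hybrid retrieval setup like the one described.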

External IDs: dblp:journals/tjs/ChenHWYM25