Abstract: Retrieval-Augmented Generation (RAG) plays a vital role in the financial domain, with widespread applications such as real-time market analysis, trend analysis, and interest rate calculation. However, most existing RAG research in finance focuses predominantly on textual data and neglects the rich visual information embedded in financial documents, resulting in a significant loss of insights valuable for financial analysis. Given that accurate, high-quality multimodal retrieval is critical in the financial domain, we design FinRAGCiteBench-V, a vision-based RAG benchmark for the financial domain. It includes (1) a bilingual retrieval corpus with 60,780 Chinese pages and 51,219 English pages drawn from a variety of real-world documents; (2) a diverse bilingual financial dataset for evaluating LLMs' generation, covering seven question categories; and (3) a baseline, RGenCite, that spans retrieval, generation, and vision-based citation. Comprehensive experiments with RGenCite validate the benchmark's robustness and diversity and provide valuable insights for multimodal RAG systems in the financial domain.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: Benchmarking, Retrieval Augmented Generation, Multimodal, Citation
Contribution Types: Data resources
Languages Studied: English, Chinese
Submission Number: 6698