AuditCopilot: Leveraging LLMs for Fraud Detection in Double-Entry Bookkeeping

Published: 21 Nov 2025 · Last Modified: 14 Jan 2026 · GenAI in Finance Poster · CC BY 4.0
Keywords: Large Language Models, Auditing, Financial auditing, Journal Entry Testing
TL;DR: AuditCopilot shows that prompt-engineered LLMs, combined with Isolation Forest, outperform traditional audit tests by delivering accurate anomaly detection with interpretable rationales.
Abstract: Auditors rely on Journal Entry Tests (JETs) to detect anomalies in tax-related ledger records, but rule-based methods generate overwhelming false positives and struggle with subtle irregularities. We investigate whether large language models (LLMs) can serve as anomaly detectors in double-entry bookkeeping. Benchmarking state-of-the-art LLMs such as LLaMA and Gemma on both synthetic and real-world anonymized ledgers, we compare them against JETs and machine learning baselines. Our results show that LLMs consistently outperform traditional rule-based JETs and classical ML baselines, while also providing natural-language explanations that enhance interpretability. These results highlight the potential of AI-augmented auditing, where human auditors collaborate with foundation models to strengthen financial integrity.
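The abstract pairs an Isolation Forest with a prompt-engineered LLM that supplies rationales. A minimal sketch of that pipeline shape is below, on synthetic ledger data; the features (amount and debit/credit imbalance), the contamination rate, and the prompt wording are illustrative assumptions, not the authors' implementation, and the prompt is only constructed here rather than sent to any model.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Synthetic double-entry ledger: each row is (amount, |debit - credit| imbalance).
# Balanced entries show only rounding-level imbalance; one injected entry is
# both unusually large and one-sided.
normal = np.column_stack([
    rng.uniform(50, 500, 200),       # typical transaction amounts
    np.abs(rng.normal(0.0, 0.01, 200)),  # near-zero imbalance
])
anomaly = np.array([[250_000.0, 250_000.0]])  # large, unbalanced posting
X = np.vstack([normal, anomaly])

# Unsupervised anomaly detection: -1 marks an outlier, 1 an inlier.
forest = IsolationForest(contamination=0.01, random_state=0)
labels = forest.fit_predict(X)
flagged = np.where(labels == -1)[0]

def explanation_prompt(row):
    """Hypothetical prompt an auditor-facing LLM assistant might receive."""
    return (
        f"A journal entry with amount {row[0]:.2f} and debit/credit "
        f"imbalance {row[1]:.2f} was flagged as anomalous. Explain in one "
        f"sentence why it may violate double-entry bookkeeping."
    )

for i in flagged:
    print(i, explanation_prompt(X[i]))
```

The division of labor mirrors the paper's claim: the classical detector supplies the anomaly score, while the LLM is used for the interpretable natural-language rationale rather than for scoring alone.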
Submission Number: 148