Keywords: Large Language Models, Retrieval-Augmented Generation, Legal Reasoning, Hallucination Mitigation
Abstract: With the growing potential of large language models (LLMs) in the legal domain, an increasing number of specialized legal models are being developed and introduced. Among them, domain-specific fine-tuning and retrieval-augmented generation (RAG) have received widespread attention, yet drawbacks such as hallucinated citations and limited explainability remain. To address these challenges, we construct AALawyer, a generative retrieval-augmented LLM system for legal reasoning in the criminal law domain, and Hallucination Risk-Benchmark, a new benchmark designed for evaluating RAG-based models. AALawyer consists of a domain-specific legal LLM, AA-LeLLM, and two retrieval modules, AC-RAG and CCs-RAG. Unlike both the traditional RAG commonly used in legal LLMs and more recent RAG variants, we propose AC-RAG, a novel generative RAG, and build CCs-RAG on a newly collected set of criminal cases for retrieval.
Experiments demonstrate AALawyer's professionalism and low hallucination on real-world cases. The model reaches the state-of-the-art level on LawBench classification tasks and scores $88.84\%$ (a $71.98\%$ improvement) on our target classification task, FAP. On the Hallucination Risk-Benchmark, AALawyer outperforms the base model, reducing hallucination risk by $37.6\%$ and improving the average score by $31.7\%$.
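The abstract only names AALawyer's components, so the sketch below is a non-authoritative illustration of how a domain legal LLM might be combined with an article-level generative retriever (an AC-RAG-like module) and a criminal-case retriever (a CCs-RAG-like module). All class and function names, the term-overlap scoring, and the prompt composition are assumptions for illustration, not the paper's actual implementation.

```python
"""Minimal sketch of a two-retriever legal RAG pipeline (illustrative assumptions only)."""
from dataclasses import dataclass


@dataclass
class RetrievedContext:
    articles: list[str]  # criminal-law articles proposed by the generative retriever (AC-RAG analogue)
    cases: list[str]     # similar criminal cases from a case database (CCs-RAG analogue)


def generative_article_retrieval(query: str, article_index: dict[str, str]) -> list[str]:
    """Stand-in for a generative article retriever: rank articles by shared terms with the query."""
    terms = set(query.lower().split())
    scored = [(len(terms & set(text.lower().split())), name) for name, text in article_index.items()]
    return [name for score, name in sorted(scored, reverse=True) if score > 0][:3]


def case_retrieval(query: str, case_db: list[str]) -> list[str]:
    """Stand-in for case retrieval: rank stored cases by naive term overlap."""
    terms = set(query.lower().split())
    ranked = sorted(case_db, key=lambda c: len(terms & set(c.lower().split())), reverse=True)
    return ranked[:2]


def answer_with_rag(query: str, article_index: dict[str, str], case_db: list[str]) -> str:
    """Compose a grounded prompt; a fine-tuned legal LLM would generate the final answer from it."""
    ctx = RetrievedContext(
        articles=generative_article_retrieval(query, article_index),
        cases=case_retrieval(query, case_db),
    )
    return (
        f"Question: {query}\n"
        f"Cited articles: {', '.join(ctx.articles) or 'none found'}\n"
        f"Similar cases retrieved: {len(ctx.cases)}"
    )


if __name__ == "__main__":
    articles = {
        "Article 264 (theft)": "theft of public or private property",
        "Article 266 (fraud)": "obtaining property by fraud",
    }
    cases = [
        "Defendant convicted of theft of private property",
        "Fraud case involving forged contracts",
    ]
    print(answer_with_rag("How is theft of private property punished?", articles, cases))
```

Grounding the prompt in retrieved articles and cases, rather than letting the model cite from memory, is the general mechanism by which RAG systems of this kind aim to reduce hallucinated citations.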
Primary Area: generative models
Submission Number: 23813