Towards More Effective Statute Retrieval for Non-Professionals: A Comparative Study of Generation-Augmented Retrieval Strategies with LLMs

Published: 2025, Last Modified: 21 Jan 2026, Venue: IJCNN 2025, License: CC BY-SA 4.0
Abstract: Legal statute retrieval is essential for making legal information more accessible and promoting fairness in society, especially for non-professionals. Because non-professionals lack specialized legal knowledge, they often formulate queries poorly, which poses a challenge for current retrieval models. Generation-Augmented Retrieval (GAR) with Large Language Models (LLMs) may offer a promising way to address this challenge by generating relevant query augmentations. This paper therefore presents the first comprehensive exploration of LLM-powered GAR for legal statute retrieval by non-professionals. We evaluate all three GAR strategies (query rewriting, statute rewriting, and keyword generation-enhanced retrieval) with both general and legal LLMs. Our findings reveal that most original GAR strategies do not yield significant performance improvements; only the keyword generation-enhanced strategy shows potential for performance gains. However, the generated keywords must also align with the linguistic style of legal statutes, and the lack of such alignment limits the current improvement in performance. To address this, we introduce Iterative-Alignment GAR (iGAR), which iteratively enhances keyword generation: it uses self-reward to construct an alignment dataset for the linguistic style of legal statutes and Direct Preference Optimization (DPO) to improve the LLM's generation. Our experimental results demonstrate that iGAR significantly improves the effectiveness of the keyword generation-enhanced strategy for legal statute retrieval.
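To make the keyword generation-enhanced strategy and the preference-data step more concrete, the following is a minimal Python sketch under stated assumptions, not the authors' implementation. The LLM is replaced by hand-written keyword strings, the retriever is a toy token-overlap scorer, and `style_reward` is a hypothetical stand-in for the self-reward signal, whose details the abstract does not give. All function names, statutes, and queries here are illustrative.

```python
from collections import Counter

def tokenize(text):
    """Lowercase whitespace tokenizer that strips surrounding punctuation."""
    tokens = (t.strip(".,;:!?()").lower() for t in text.split())
    return [t for t in tokens if t]

def overlap_score(query, statute):
    """Toy lexical relevance: number of query tokens shared with the statute."""
    q, s = Counter(tokenize(query)), Counter(tokenize(statute))
    return sum(min(q[t], s[t]) for t in q)

def retrieve(query, statutes, keywords=()):
    """Return the index of the best statute for the keyword-augmented query."""
    augmented = " ".join([query, *keywords])
    return max(range(len(statutes)),
               key=lambda i: overlap_score(augmented, statutes[i]))

def build_preference_pair(query, candidates, reward):
    """Rank candidate keyword strings by a reward and keep best vs. worst.

    The prompt/chosen/rejected layout is a common format for preference data
    used in DPO-style training; the reward function is an assumption.
    """
    ranked = sorted(candidates, key=reward, reverse=True)
    return {"prompt": query, "chosen": ranked[0], "rejected": ranked[-1]}

# Illustrative data (invented for this sketch).
statutes = [
    "An employer shall pay wages no later than the agreed date.",
    "A lessor shall return the security deposit upon termination of the lease.",
]
query = "my landlord will not pay my money back"
keywords = ["lessor", "security deposit", "lease"]  # stand-in for LLM output

def style_reward(candidate):
    """Hypothetical proxy for statute-style alignment: count statute-vocabulary tokens."""
    vocab = {t for s in statutes for t in tokenize(s)}
    return sum(t in vocab for t in tokenize(candidate))

plain_hit = retrieve(query, statutes)
augmented_hit = retrieve(query, statutes, keywords)
pair = build_preference_pair(
    query,
    ["landlord money back", "lessor security deposit lease"],
    style_reward,
)
```

In this toy setup the lay query matches the wage statute on the word "pay", while the statute-style keywords shift the top result to the lease statute. An iterative scheme in the spirit of iGAR would repeat the generate, score, and optimize cycle, with each round's preference pairs drawn from the previously updated model.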