Keywords: Chain-of-Thought, Reasoning Boundary
TL;DR: We propose a simple but effective method to improve the legal reasoning boundary of LLMs.
Abstract: Large language models (LLMs) have made significant strides in the legal domain, such as recommending charges based on a given criminal fact. However, existing LLMs struggle with the long-tail problem in the legal domain, primarily due to the imbalanced distribution of legal data during pre-training, which results in uneven inference capabilities, especially for charges in the long-tail category. Moreover, common methods for enhancing LLMs' reasoning abilities, such as the chain-of-thought series and the retrieval-augmented generation series, also fail to address the long-tail charge prediction problem. In this work, we reveal that solving this issue requires providing LLMs with the legal knowledge they lack, particularly regarding easily confused and tailed legal concepts. We propose a legal knowledge retrieval method (denoted as KnowTail) that includes a Knowledge Inspector to identify the model's knowledge gaps and a Knowledge Integrator to provide tailored legal knowledge for accurate legal reasoning. Extensive experiments on real-world datasets demonstrate that our method significantly surpasses prior state-of-the-art methods, e.g., achieving average F1 gains of 22.91% for overall charges and 22.94% for tail charges on GPT-4o, and gains of 26.74% (overall) and 27.16% (tail) on Qwen3-8B. Our code and data are available at: https://anonymous.4open.science/r/KnowTail-F860.
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 17582