LegEx: Dataset for Legal Case Retrieval Based on Negation and Exclusion Conditions

ACL ARR 2025 July Submission1452 Authors

29 Jul 2025 (modified: 27 Aug 2025), ACL ARR 2025 July Submission, CC BY 4.0

Abstract: In legal case retrieval, queries often contain negation or exclusion conditions such as "not having done ~" or "excluding ~". However, existing studies rarely address such expressions systematically. To bridge this gap, this study constructs a dataset tailored to legal case retrieval under negation and exclusion conditions, consisting of queries, their relevant cases, and challenging negative examples. It also evaluates experimentally how well existing information retrieval models handle such conditions and how much fine-tuning improves their performance. The results show that pretrained information retrieval models initially fail to handle negation and exclusion expressions properly, whereas their ability to respond to these conditions improves significantly after fine-tuning. By introducing a specialized dataset for negation and exclusion queries in the legal domain, where such queries were previously unexplored, this study highlights the limitations of current retrieval models and shows that a dataset-driven approach can effectively overcome them.

Paper Type: Long

Research Area: Information Retrieval and Text Mining

Research Area Keywords: Negation and exclusion conditions, Law case retrieval, Dataset, Information retrieval, Embedding model, Fine-tuning, Legal domain

Contribution Types: Data resources

Languages Studied: Korean

Submission Number: 1452
