LawShift: Benchmarking Legal Judgment Prediction Under Statute Shifts

Published: 18 Sept 2025, Last Modified: 03 Nov 2025 · NeurIPS 2025 Datasets and Benchmarks Track (poster) · CC BY 4.0
Keywords: Legal Judgment Prediction; AI for Law; Legal Benchmark
Abstract: Legal Judgment Prediction (LJP) seeks to predict case outcomes from available case information, offering practical value to both legal professionals and laypersons. However, a key limitation of existing LJP models is their poor adaptability to statutory revisions: current state-of-the-art (SOTA) models are neither designed nor evaluated for them. To bridge this gap, we introduce LawShift, a benchmark dataset for evaluating LJP under statutory revisions. Covering 31 fine-grained change types, LawShift enables systematic assessment of SOTA models' ability to handle legal changes. We evaluate five representative SOTA models on LawShift, uncovering significant limitations in their responses to legal updates. Our findings show that model architecture plays a critical role in adaptability, offering actionable insights and guiding future research on LJP in dynamic legal contexts.
Croissant File: json
Dataset URL: https://huggingface.co/datasets/triangularPeach/LawShift/tree/main
Code URL: https://github.com/triangularPeach/LawShift
Primary Area: AI/ML Datasets & Benchmarks for social sciences (e.g. climate, health, life sciences, physics, social sciences)
Flagged For Ethics Review: true
Submission Number: 1685