TathyaNyaya and FactLegalLlama: Advancing Factual Judgment Prediction and Explanation in the Indian Legal Context
Abstract: In the legal domain, Fact-based Judgment Prediction and Explanation (FJPE) aims to predict judicial outcomes and generate grounded explanations using only factual information, mirroring early-phase legal reasoning. Motivated by the overwhelming case backlog in the Indian judiciary, we introduce **TathyaNyaya**, the first large-scale, expert-annotated dataset for FJPE in the Indian context. Covering judgments from the Supreme Court and multiple High Courts, the dataset comprises four complementary components, **NyayaFacts**, **NyayaScrape**, **NyayaSimplify**, and **NyayaFilter**, that facilitate diverse factual modeling strategies. Alongside the dataset, we present **FactLegalLlama**, a LLaMA-3-8B model instruction-tuned to generate faithful, fact-grounded explanations. While FactLegalLlama trails transformer baselines in raw prediction accuracy, it excels at generating interpretable explanations, as validated by both automatic metrics and legal expert evaluation. Our findings show that fact-only inputs and preprocessing techniques such as text simplification and fact filtering can improve both interpretability and predictive performance. Together, TathyaNyaya and FactLegalLlama establish a robust foundation for realistic, transparent, and trustworthy AI applications in the Indian legal system.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: Fact-based Judgment Prediction, Legal AI, Explanation Generation, Indian Legal System, Legal Natural Language Processing, Annotated Legal Datasets, Large Language Models, Interpretability in AI, Automated Legal Analysis
Contribution Types: Approaches to low-resource settings, Publicly available software and/or pre-trained models, Data resources, Data analysis, Position papers
Languages Studied: English
Submission Number: 6074