TathyaNyaya and FactLegalLlama: Advancing Factual Judgment Prediction and Explanation in the Indian Legal Context

ACL ARR 2025 July Submission 343 Authors

27 Jul 2025 (modified: 25 Aug 2025), ACL ARR 2025 July Submission, CC BY 4.0
Abstract: In the legal domain, Fact-based Judgment Prediction and Explanation (FJPE) aims to predict judicial outcomes and generate grounded explanations using only factual information, mirroring early-phase legal reasoning. Motivated by the overwhelming case backlog in the Indian judiciary, we introduce TathyaNyaya, the first large-scale, expert-annotated dataset for FJPE in the Indian context. Covering judgments from the Supreme Court and multiple High Courts, the dataset comprises four complementary components (NyayaFacts, NyayaScrape, NyayaSimplify, and NyayaFilter) that facilitate diverse factual modeling strategies. Alongside the dataset, we present FactLegalLlama, a LLaMA-3-8B model instruction-tuned to generate faithful, fact-grounded explanations. While FactLegalLlama trails transformer baselines in raw prediction accuracy, it excels at generating interpretable explanations, as validated by both automatic metrics and legal expert evaluation. Our findings show that fact-only inputs and preprocessing techniques such as text simplification and fact filtering can improve both interpretability and predictive performance. Together, TathyaNyaya and FactLegalLlama establish a robust foundation for realistic, transparent, and trustworthy AI applications in the Indian legal system.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: Fact based Judgment Prediction, Legal AI, Explanation Generation, Indian Legal System, Legal Natural Language Processing, Annotated Legal Datasets, Large Language Models, Interpretability in AI, Automated Legal Analysis
Contribution Types: Model analysis & interpretability, Approaches to low-resource settings, Approaches low compute settings-efficiency, Publicly available software and/or pre-trained models, Data resources, Data analysis, Position papers
Languages Studied: English
Previous URL: https://openreview.net/forum?id=lXV2Be4b6b
Explanation Of Revisions PDF: pdf
Reassignment Request Area Chair: Yes, I want a different area chair for our submission
Reassignment Request Reviewers: Yes, I want a different set of reviewers
Justification For Not Keeping Action Editor Or Reviewers: We sincerely appreciate the valuable feedback and efforts of the previous action editor and reviewers. However, for this new submission, we have made substantial changes to the paper, including incorporating expert evaluations, conducting inter-annotator agreement analysis, adding new experimental results and ablation studies, and significantly reorganizing and rewriting parts of the manuscript. Given the extent of these changes, we believe a fresh perspective from a new action editor and reviewers may lead to a more balanced and updated evaluation of the revised version. This decision is made with due respect to the prior reviewers and to ensure that the revised paper is assessed on its current merits.
Software: zip
Data: zip
A1 Limitations Section: This paper has a limitations section.
A2 Potential Risks: N/A
B Use Or Create Scientific Artifacts: Yes
B1 Cite Creators Of Artifacts: Yes
B1 Elaboration: Subsection 4.1 Dataset Compilation and Statistics
B2 Discuss The License For Artifacts: No
B2 Elaboration: We will release the dataset, code, and models after acceptance of the paper.
B3 Artifact Use Consistent With Intended Use: No
B3 Elaboration: We intend to release the dataset for research.
B4 Data Contains Personally Identifying Info Or Offensive Content: N/A
B5 Documentation Of Artifacts: N/A
B6 Statistics For Data: Yes
B6 Elaboration: Section 4 Dataset
C Computational Experiments: Yes
C1 Model Size And Budget: Yes
C1 Elaboration: Appendix section A Experimental Setup and Hyper-parameters
C2 Experimental Setup And Hyperparameters: Yes
C2 Elaboration: Appendix section A Experimental Setup and Hyper-parameters
C3 Descriptive Statistics: Yes
C3 Elaboration: Section 7 Results and Analysis
C4 Parameters For Packages: Yes
C4 Elaboration: Appendix section A Experimental Setup and Hyper-parameters
D Human Subjects Including Annotators: Yes
D1 Instructions Given To Participants: Yes
D1 Elaboration: Subsection 4.2 Annotation Methodology and Quality Assurance
D2 Recruitment And Payment: Yes
D2 Elaboration: We assigned the annotation work to students as part of their academic assignments. No additional monetary payment was provided, as this task was integrated into their coursework and aligned with their academic learning objectives.
D3 Data Consent: No
D3 Elaboration: Openly available dataset
D4 Ethics Review Board Approval: N/A
D5 Characteristics Of Annotators: Yes
D5 Elaboration: Subsection 4.2 Annotation Methodology and Quality Assurance
E Ai Assistants In Research Or Writing: No
E1 Information About Use Of Ai Assistants: N/A
Author Submission Checklist: yes
Submission Number: 343