Extract-Explain-Abstract: A Rhetorical Role-Driven Domain-Specific Summarisation Framework for Indian Legal Documents
Abstract: Legal documents are long, intricate, and dense with jargon, which makes effective summarisation both important and challenging. This paper introduces the Rhetorical Role-based Extract-Explain-Abstract (EEA) Framework, a novel three-stage methodology for summarising Indian legal documents in low-resource settings. The extraction stage segments legal texts by rhetorical role (such as facts, issues, and arguments) using a domain-specific phrase corpus and TF-IDF-based extraction. The explanation stage enriches the segmented output with logical connections, leveraging Rhetorical Structure Theory to ensure coherence and legal fidelity. The final abstraction stage condenses these interlinked segments into cogent, high-level summaries that preserve critical legal reasoning. We focus primarily on small language models (SLMs) because they can be deployed efficiently on local GPUs and fine-tuned cost-effectively for specific legal domains or drafting styles. Experiments on Indian legal datasets show that the EEA framework typically outperforms the compared approaches on ROUGE, BERTScore, and human evaluations. We also employ an InLegalBERT score as a metric to capture the domain-specific semantics of Indian legal documents.
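Illustrative sketch (not the authors' implementation): the extraction stage described in the abstract, role assignment from a phrase corpus followed by TF-IDF-based selection, could look roughly like the following. The cue phrases in `ROLE_CUES`, the function names, and the mean-TF-IDF sentence score are all assumptions made for illustration; the abstract does not specify the actual corpus or scoring.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical cue phrases per rhetorical role (illustrative only,
# not the phrase corpus used in the paper).
ROLE_CUES = {
    "facts": ["the appellant was", "on the date of", "it is alleged"],
    "issues": ["the question is", "whether the"],
    "arguments": ["learned counsel", "it was contended", "submitted that"],
}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def assign_role(sentence):
    """Return the first role whose cue phrase occurs in the sentence, else 'other'."""
    lowered = sentence.lower()
    for role, cues in ROLE_CUES.items():
        if any(cue in lowered for cue in cues):
            return role
    return "other"


def tfidf_scores(sentences):
    """Score each sentence by the mean smoothed TF-IDF weight of its tokens."""
    docs = [tokenize(s) for s in sentences]
    n_docs = len(docs)
    # Document frequency: in how many sentences each token appears.
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    scores = []
    for doc in docs:
        if not doc:
            scores.append(0.0)
            continue
        term_freq = Counter(doc)
        score = sum(
            (count / len(doc)) * (math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1.0)
            for tok, count in term_freq.items()
        )
        scores.append(score)
    return scores


def extract_by_role(sentences, top_k=1):
    """Group sentences by rhetorical role and keep the top_k by TF-IDF score."""
    scores = tfidf_scores(sentences)
    grouped = defaultdict(list)
    for sentence, score in zip(sentences, scores):
        grouped[assign_role(sentence)].append((score, sentence))
    return {
        role: [s for _, s in sorted(items, key=lambda p: p[0], reverse=True)[:top_k]]
        for role, items in grouped.items()
    }


sentences = [
    "It is alleged that the appellant was present at the scene.",
    "The question is whether the confession was voluntary.",
    "Learned counsel submitted that the evidence was inadmissible.",
    "The appeal is dismissed.",
]
segments = extract_by_role(sentences)
```

Here `segments` maps each role label to its highest-scoring sentences. In a full pipeline these role-tagged segments would be passed on to the explanation and abstraction stages.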
Paper Type: Short
Research Area: Summarization
Research Area Keywords: extractive summarisation, abstractive summarisation, long-form summarization, sentence compression, evaluation
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 2871