Reinforcement Learning with Argument-Structured Reward for Court Decision Abstractive Summarization

Yuntao Kong, Ye Xiong, Shuyuan Zheng, Ken Satoh

Published: 2025, Last Modified: 08 May 2026, JURIX 2025, CC BY-SA 4.0

Abstract: Court decision summarization is challenging due to the significant length and structural complexity of legal documents, which makes existing reinforcement learning (RL)-based abstractive summarization methods less effective. We propose an RL-based abstractive legal summarization model with a novel argument-structured reward mechanism. It leverages Issue–Reason–Conclusion components to compute fine-grained sub-rewards and aggregate them into a final reward. This design provides more reliable learning signals for model optimization. Experiments on the Indian Supreme Court dataset with Longformer-Encoder-Decoder (LED) and Llama demonstrate consistent improvements over non-argument-structured methods. To the best of our knowledge, this is the first work to incorporate argumentative structures into RL-based summarization, offering a novel direction for improving legal summarization.
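The reward aggregation described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the generated and reference summaries have already been segmented into Issue, Reason, and Conclusion text, it uses a simple unigram-overlap F1 as a stand-in for whatever similarity measure the paper uses, and it aggregates with a weighted average. The names `unigram_f1`, `argument_structured_reward`, and the `weights` parameter are hypothetical.

```python
from collections import Counter

# Argument components named in the abstract.
COMPONENTS = ("issue", "reason", "conclusion")


def unigram_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between two strings (an assumed stand-in metric)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped count of shared tokens
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def argument_structured_reward(generated, reference, weights=None):
    """Compute one sub-reward per component and aggregate them.

    `generated` and `reference` map component names to text.
    `weights` is a hypothetical per-component weighting; equal
    weights are assumed when it is omitted.
    Returns (final_reward, sub_rewards).
    """
    if weights is None:
        weights = {name: 1.0 for name in COMPONENTS}
    sub_rewards = {
        name: unigram_f1(generated.get(name, ""), reference.get(name, ""))
        for name in COMPONENTS
    }
    total_weight = sum(weights[name] for name in COMPONENTS)
    final_reward = (
        sum(weights[name] * sub_rewards[name] for name in COMPONENTS)
        / total_weight
    )
    return final_reward, sub_rewards
```

In an RL fine-tuning loop, a scalar such as `final_reward` would typically be the signal passed to the policy update, while the per-component values in `sub_rewards` show which part of the argument structure a summary covers poorly.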

External IDs: dblp:conf/jurix/KongXZS25