Intermediate Hidden Layers for Legal Case Retrieval Representation

Published: 01 Jan 2024 · Last Modified: 16 May 2025 · DEXA (2) 2024 · CC BY-SA 4.0
Abstract: In intelligent legal systems, legal case retrieval is the task of finding case documents relevant to a given legal matter. Although Pretrained Language Models (PLMs) have achieved impressive results across many information retrieval tasks, effective strategies for legal case retrieval remain an open problem. Legal case documents are typically long and loosely structured, which sets them apart from general documents, and most PLMs struggle to capture the long-range connections between the various structures in such text, so the intricate legal details these documents contain may not be fully represented. To address these challenges, we propose an approach that combines the outputs of multiple intermediate layers of a pretrained transformer model while considering the entire legal case document. Our primary objective is to produce more accurate and meaningful embedding representations that capture the semantic and syntactic relationships among the sentences of a document while preserving each sentence's contextual interactions with its neighbors, thereby improving the overall understanding of the document's content. We demonstrate the performance of the proposed approach through experiments on publicly available legal datasets.
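The core idea of pooling multiple intermediate hidden layers, rather than only the final layer, can be sketched as follows. This is an illustrative toy example, not the paper's actual implementation: the encoder size, the choice of which layers to combine, and the mean-pooling scheme are all assumptions made for demonstration.

```python
import torch
import torch.nn as nn

# Minimal sketch: a small transformer encoder whose intermediate layer
# outputs are collected and averaged to form a sentence embedding,
# instead of relying solely on the final layer's output.
torch.manual_seed(0)

d_model, n_layers, seq_len = 32, 4, 10
layers = nn.ModuleList(
    nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
    for _ in range(n_layers)
)

tokens = torch.randn(1, seq_len, d_model)  # stand-in for token embeddings

hidden_states = []
h = tokens
for layer in layers:
    h = layer(h)
    hidden_states.append(h)  # keep every intermediate representation

# Mean-pool over tokens within each layer, then average across the
# selected layers (which layers to use is an illustrative choice here).
selected = hidden_states[-3:]
per_layer = [s.mean(dim=1) for s in selected]    # each is [1, d_model]
embedding = torch.stack(per_layer).mean(dim=0)   # final sentence embedding
print(embedding.shape)  # torch.Size([1, 32])
```

In practice the same pattern applies to a real PLM (e.g. requesting all hidden states from a pretrained encoder) and to longer documents, where sentence-level embeddings built this way can then be compared for retrieval.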