Different Approaches in Cross-Language Similar Documents Retrieval in the Legal Domain

Vladimir Zhebel, Denis Zubarev, Ilya Sochenkov

Published: 2020, Last Modified: 19 Mar 2026, SPECOM 2020, CC BY-SA 4.0
Abstract: Cross-lingual information retrieval in the legal domain is a timely problem, because improving legislation requires studying the best international practices. One possible solution is the retrieval of thematically similar documents. However, this requires solving the important task of transferring between languages. The paper describes different approaches to this problem, from classical mediator-based methods to modern methods of distributional semantics. The UN Digital Library is used as the test collection. The combination of the extended translation model and the BM25 ranking function demonstrates the best results.