SimpleText Best of Labs in CLEF-2024: Application of Large Language Models for Scientific Text Simplification

Published: 2025, Last Modified: 15 Jan 2026, Venue: CLEF 2025, License: CC BY-SA 4.0
Abstract: Scientific literature remains difficult to access and understand because of its inherent complexity: it uses specialized terminology and relies on readers' prior knowledge. The CLEF 2024 SimpleText lab aims to make scientific information more accessible to a broader audience through advances in information retrieval and natural language processing. This paper explores the application of large language models to three scientific text simplification tasks. Task 1 involves retrieving passages to include in a simplified summary. The proposed systems select candidates using TF-IDF with queries expanded via LLaMA3, followed by bi-encoder, cross-encoder, and LLaMA3-based re-ranking. Task 2 aims to identify and explain difficult concepts; three models based on LLaMA3 and Mistral are explored for it. Task 3 focuses on simplifying scientific text and, as in Task 2, uses LLaMA3 and Mistral with different prompting and fine-tuning approaches. The experimental results show that the proposed systems for Task 1 are the most effective, while those for Tasks 2 and 3 are comparable with other systems proposed in the SimpleText lab.
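The retrieve-then-re-rank structure described for Task 1 can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`retrieve`, `rerank`), the smoothed IDF weighting, and the cosine scoring are assumptions chosen for illustration. The expansion terms are supplied by hand rather than generated by LLaMA3, and `score_fn` is a hypothetical placeholder for a bi-encoder, cross-encoder, or LLM-based scorer.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and split on any non-alphanumeric character.
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return cleaned.split()


def tfidf_vectors(docs):
    # Build sparse TF-IDF vectors (dicts) with a smoothed IDF (assumed weighting).
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [
        {t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized
    ]
    return vectors, idf


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, expansion_terms, docs, k):
    # Stage 1: select the top-k candidates by TF-IDF cosine similarity,
    # using the query extended with expansion terms (hand-supplied here;
    # the paper reports generating expansions with LLaMA3).
    vectors, idf = tfidf_vectors(docs)
    q_tokens = tokenize(query + " " + " ".join(expansion_terms))
    q_vec = {t: tf * idf[t] for t, tf in Counter(q_tokens).items() if t in idf}
    ranked = sorted(
        range(len(docs)), key=lambda i: cosine(q_vec, vectors[i]), reverse=True
    )
    return ranked[:k]


def rerank(query, docs, candidates, score_fn):
    # Stage 2: re-order the candidates with a stronger scorer.
    # score_fn(query, passage) -> float is a hypothetical placeholder.
    return sorted(candidates, key=lambda i: score_fn(query, docs[i]), reverse=True)
```

In use, `retrieve` would be called once per query to narrow a large collection to a small candidate set, and `rerank` would then be applied to that set with whichever scoring model is available. Keeping the two stages separate means the expensive scorer only ever sees `k` passages.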