Adapting LLMs for Domain-Specific Retrieval: A Case Study in Nuclear Safety

Published: 2025 · Last Modified: 05 Feb 2026 · ECIR (5) 2025 · CC BY-SA 4.0
Abstract: Large Language Models (LLMs) have revolutionized information retrieval (IR) by improving query formulation, aggregating results, and generating intuitive summaries. However, their generalist training often limits their effectiveness in specialized domains, where precise terminology, relationships, and context are critical. To address this, we propose a simple yet effective method to adapt LLMs for domain-specific tasks by leveraging structured knowledge bases through a technique called "textification". This approach transforms domain knowledge, such as glossaries and ontologies, into synthetic textual definitions or question-answer pairs, which are then used to fine-tune an LLM so that it internalizes specialized concepts, definitions, and semantic relationships. We demonstrate the impact of this approach with a case study in the nuclear safety domain, leveraging the International Atomic Energy Agency (IAEA) Safety Glossary to fine-tune an LLM. The adapted model consistently improves upon off-the-shelf versions in tasks such as concept recognition, definition generation, and classification of ontological relationships. These results highlight the potential of domain-adapted LLMs to enhance retrieval-augmented systems, enabling more accurate and contextually relevant applications.
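To make the idea concrete, the "textification" step described above can be sketched as a small transformation from structured glossary entries to synthetic training text. The function and templates below are illustrative assumptions, not the authors' implementation, and the sample entry paraphrases a well-known nuclear-safety concept rather than quoting the IAEA Safety Glossary:

```python
def textify(glossary: dict[str, str]) -> list:
    """Turn {term: definition} entries into synthetic fine-tuning
    examples: a plain definition sentence plus a question-answer pair.
    Templates are hypothetical; real pipelines would vary phrasings."""
    examples = []
    for term, definition in glossary.items():
        # Synthetic textual definition
        examples.append(f"{term}: {definition}")
        # Synthetic question-answer pair
        examples.append({
            "question": f"What is meant by '{term}'?",
            "answer": definition,
        })
    return examples


# Illustrative single-entry glossary (paraphrased, not an official definition)
glossary = {
    "defence in depth": (
        "A hierarchical deployment of different levels of equipment and "
        "procedures to prevent or mitigate the escalation of incidents."
    )
}

samples = textify(glossary)
```

Each entry yields two training examples here; relation triples from an ontology could be textified analogously (e.g. turning "A is-a B" into a classification question), which is what the relationship-classification task in the paper suggests.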