Annif at the GermEval-2025 LLMs4Subjects Task: Traditional XMTC Augmented by Efficient LLMs

Osma Suominen; Juho Inkinen; Mona Lehtinen

Published: 14 Aug 2025, Last Modified: 21 Aug 2025, GermEval25 Oral, CC BY 4.0

Keywords: machine learning, subject indexing, artificial intelligence, optimisation

Paper Type: System Description Paper

Track: LLMs4Subjects

Abstract: This paper presents the Annif system in the LLMs4Subjects shared task (Subtask 2) at GermEval-2025. The task required creating subject predictions for bibliographic records using large language models, with a special focus on computational efficiency. Our system, based on the Annif automated subject indexing toolkit, refines our previous system from the first LLMs4Subjects shared task, which produced excellent results. We further improved the system by using many small and efficient language models for translation and synthetic data generation and by using LLMs for ranking candidate subjects. Our system ranked 1st in the overall quantitative evaluation and 1st in the qualitative evaluation of Subtask 2.

Copyright Transfer Agreement: pdf

Submission Number: 1
