Last Minute at the GermEval-2025 LLMs4Subjects Task: Few-Shot Contrastive Learning for Multilingual Multi-Label Classification
Keywords: multi-label classification, few-shot fine-tuning, contrastive learning
Paper Type: System Description Paper
TL;DR: We combine few-shot contrastive learning and multilingual embeddings to classify skewed bilingual technical records with minimal fine-tuning.
Track: LLMs4Subjects
Abstract: This paper addresses the challenges of, and recent advancements in, automated domain classification for digital libraries, with a focus on integrating multilingual text embeddings to improve the classification of bilingual technical records. We use the FastFit package to fine-tune a model for multi-label classification on the TIBKAT dataset, whose subject categories follow a highly skewed distribution. To enhance label interpretability, we generate detailed class descriptions with GPT-4o mini. Our method combines few-shot learning with a modified score calculation that makes scores comparable across labels, enabling effective and scalable classification without extensive fine-tuning.
Submission Number: 4