Maistros: A Greek Large Language Model Adapted Through Knowledge Distillation From Large Reasoning Models

Nikolaos Giarelis; Charalampos Mastrokostas; Nikos I. Karacapilidis

Published: 02 Jun 2026, Last Modified: 02 Jun 2026
Venue: Greeks in AI 2026
License: CC BY 4.0

Keywords: Large Language Models, Natural Language Processing, Language and Learning, Greek Language, Question Answering, Knowledge Distillation, Empirical Research

Domains: Language and Learning, Other

TL;DR: We introduce Maistros 8B, a state-of-the-art open-weights Greek LLM fine-tuned on CulturaQA, our high-quality synthetic and human-curated dataset, to narrow the performance gap in under-resourced Greek NLP tasks.

External Link: https://arxiv.org/abs/2605.01870

Abstract: Large Language Models (LLMs) have substantially advanced the field of Natural Language Processing (NLP), achieving state-of-the-art performance across a wide range of tasks. These improvements have been attributed, in part, to their emerging reasoning capabilities, which are enabled by large-scale training and increased model capacity. However, existing LLMs can generate erroneous responses when addressing complex queries that fall outside their training distribution, due to limited internal knowledge or the need for multi-step reasoning. To address these limitations, recent work has introduced large reasoning models (LRMs), which incorporate explicit internal reasoning processes to improve response accuracy. However, state-of-the-art LRMs often comprise hundreds of billions of parameters and require several seconds per inference, even on advanced multi-GPU systems. These characteristics limit their practicality for deployment in conventional computing environments. Meanwhile, NLP research on multilingual LLMs continues to prioritize high-resource languages, and the resulting models exhibit limited performance in under-resourced languages, primarily due to insufficient language- and culture-specific training data. In this paper, we focus on Modern Greek, for which only a limited number of question answering (QA) datasets have been proposed, most of which are intended for model evaluation. To address this research gap in Greek QA, we make the following contributions: (i) CulturaQA, a high-quality LRM-generated and human-curated dataset for Greek LLM training and evaluation; (ii) a memory-efficient LLM evaluation framework adaptable to diverse languages and QA tasks; (iii) Maistros 8B, a state-of-the-art open-weights Greek LLM developed via knowledge distillation and fine-tuning on CulturaQA; and (iv) a comprehensive evaluation of nine LLMs across nine human-curated Greek QA datasets. We release our code, model, and data to support reproducibility.

Submission Number: 33
