MLCU: A Dataset for Evaluating Domain-Specific Language Understanding in LLMs

ACL ARR 2026 January Submission8436 Authors

06 Jan 2026 (modified: 20 Mar 2026) | ACL ARR 2026 January Submission | Readers: Everyone | License: CC BY 4.0

Keywords: NLP, Military Language, AI, ML

Abstract: Modern natural language processing systems are typically trained on large general-purpose corpora in which domain-specific language is underrepresented, with domain adaptation left to fine-tuning as a secondary step. This gap limits the effectiveness of language models in critical applications such as military-specific functions. This paper presents the development and analysis of MLCU, the Military Language Comprehension and Understanding dataset, a novel, comprehensive domain-specific language dataset tailored for military speech detection. We introduce a curated lexicon and example phrases in English, drawn from authentic and representative military communications, as the basis for benchmarking and annotating contextual military term disambiguation. Evaluating Gemini 2.5 Flash and Llama 3.1-8B on our dataset yielded accuracies of 87.6% and 80.8%, respectively, indicating that state-of-the-art LLMs have difficulty disambiguating terms that are used in nonstandard ways, as in military language. This work highlights the limitations of existing models and the room for improvement in handling domain-specific language in NLP.

Paper Type: Short

Research Area: Semantics: Lexical, Sentence-level Semantics, Textual Inference and Other areas

Research Area Keywords: Resources and Evaluation, Semantics: Lexical and Sentence-Level

Contribution Types: Model analysis & interpretability, Data resources, Data analysis

Languages Studied: English

Submission Number: 8436
