It's the same but not the same: Do LLMs distinguish Spanish varieties?

Marina Mayor Rocher, Nina Melero, Cristina Pozo Huertas, Gonzalo Martinez Ruiz de Arcaute, María Grandury, Pedro Reviriego

Published: 20 Aug 2025, Last Modified: 07 Jan 2026 | Zenodo | CC BY-SA 4.0
Abstract: Spanish, spoken by over 600 million people, exhibits significant lexical, morphological, and syntactic diversity. Traditional benchmarks often overlook dialectal nuances, leading to biased assessments. The Spanish Dialect Benchmark addresses this gap: it evaluates the ability of LLMs to distinguish and accurately use different Spanish dialects, using 31 multiple-choice questions that reflect regional linguistic variation.

Examples:

¿Cuál suena más natural?
a. «Llegas tarde, vístete y corre». (Peninsular, Chilean Spanish)
b. «Llegas tarde, vístete y córrele». (Antillean, Mexican Spanish)

¿Qué verbo usas para describir la acción de ponerse de pie?
a. levantarse (Rioplatense, Peninsular Spanish)
b. pararse (Antillean, Mexican Spanish)
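To make the item format concrete, the following is a minimal sketch of how the two example questions could be represented and how a model's choices could be tallied by variety. The class names (`Option`, `Item`), the `variety_counts` helper, and the tallying scheme are assumptions made for illustration; the abstract does not describe the dataset's actual schema or the authors' scoring method.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    """One answer choice and the varieties it is labeled with (hypothetical schema)."""
    text: str
    varieties: tuple[str, ...]


@dataclass(frozen=True)
class Item:
    """One multiple-choice question, keyed by option letter (hypothetical schema)."""
    question: str
    options: dict[str, Option]


# The two example items quoted in the abstract.
ITEMS = [
    Item(
        question="¿Cuál suena más natural?",
        options={
            "a": Option("«Llegas tarde, vístete y corre».", ("Peninsular", "Chilean")),
            "b": Option("«Llegas tarde, vístete y córrele».", ("Antillean", "Mexican")),
        },
    ),
    Item(
        question="¿Qué verbo usas para describir la acción de ponerse de pie?",
        options={
            "a": Option("levantarse", ("Rioplatense", "Peninsular")),
            "b": Option("pararse", ("Antillean", "Mexican")),
        },
    ),
]


def variety_counts(items, answers):
    """Count how often the chosen options are labeled with each variety.

    `answers` holds one option letter per item, in item order. This tally is
    an illustrative assumption, not the benchmark's published metric.
    """
    counts = Counter()
    for item, letter in zip(items, answers):
        counts.update(item.options[letter].varieties)
    return counts
```

Under these assumptions, a model that picks option a for both items would be counted twice for Peninsular and once each for Chilean and Rioplatense, which is one simple way to surface a skew toward a particular variety.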