It's the same but not the same: Do LLMs distinguish Spanish varieties?
Abstract: Spanish, spoken by over 600 million people, exhibits significant lexical, morphological, and syntactic diversity. Traditional benchmarks often overlook dialectal nuances, which leads to biased assessments. The Spanish Dialect Benchmark dataset addresses this gap by evaluating the ability of LLMs to distinguish and accurately use different Spanish dialects. It consists of 31 multiple-choice questions that reflect regional linguistic variation.

Examples:

¿Cuál suena más natural? [Which sounds more natural?]
a. «Llegas tarde, vístete y corre». (Peninsular, Chilean Spanish)
b. «Llegas tarde, vístete y córrele». (Antillean, Mexican Spanish)

¿Qué verbo usas para describir la acción de ponerse de pie? [Which verb do you use to describe the action of standing up?]
a. levantarse (Rioplatense, Peninsular Spanish)
b. pararse (Antillean, Mexican Spanish)
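
As an illustration of how items like the two examples could be represented and tallied, here is a minimal Python sketch. The `Option` and `Item` classes, the `variety_profile` function, and the answer list are all hypothetical names chosen for this sketch; the actual file format and scoring procedure of the dataset are not described in the abstract, so none of this should be read as the authors' method.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    """One answer choice and the varieties it is labeled with (hypothetical schema)."""
    text: str
    varieties: tuple[str, ...]


@dataclass(frozen=True)
class Item:
    """One multiple-choice question (hypothetical schema)."""
    prompt: str
    options: dict[str, Option]


# The two example items quoted in the abstract.
ITEMS = [
    Item(
        prompt="¿Cuál suena más natural?",
        options={
            "a": Option("«Llegas tarde, vístete y corre».", ("Peninsular", "Chilean")),
            "b": Option("«Llegas tarde, vístete y córrele».", ("Antillean", "Mexican")),
        },
    ),
    Item(
        prompt="¿Qué verbo usas para describir la acción de ponerse de pie?",
        options={
            "a": Option("levantarse", ("Rioplatense", "Peninsular")),
            "b": Option("pararse", ("Antillean", "Mexican")),
        },
    ),
]


def variety_profile(items: list[Item], answers: list[str]) -> Counter:
    """Count how often the chosen options carry each variety label."""
    counts: Counter = Counter()
    for item, choice in zip(items, answers):
        counts.update(item.options[choice].varieties)
    return counts


# Assumed model answers, for illustration only (not real results).
profile = variety_profile(ITEMS, ["b", "b"])
print(profile.most_common())
```

A tally of this kind only shows which variety labels a set of answers lines up with; how the benchmark itself defines a correct or biased response would need to be taken from the dataset documentation.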
External IDs: doi:10.5281/zenodo.15101402