Abstract: Natural Question Answering (QA) datasets play a crucial role in evaluating the capabilities of large language models (LLMs) and ensuring their effectiveness in real-world applications. Although numerous QA datasets have been developed, along with some parallel efforts, there remains a notable lack of both a framework and large-scale region-specific datasets built from queries posed by native users in their own languages. This gap hinders effective benchmarking and the development of fine-tuned models that capture regional and cultural specificities. In this study, we propose a scalable, language-independent framework, NativQA, to seamlessly construct culturally and regionally aligned QA datasets in native languages for LLM evaluation and tuning. We demonstrate the efficacy of the proposed framework by constructing a multilingual natural QA dataset, MultiNativQA, consisting of ~64k manually annotated QA pairs in seven languages, ranging from high- to extremely low-resource, based on queries from native speakers in 9 regions and covering 18 topics. We benchmark open- and closed-source LLMs on the MultiNativQA dataset. We make the NativQA framework, the MultiNativQA dataset, and the experimental scripts publicly available to the community (https://anonymous.com/).
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: resources for less-resourced languages, multilingual benchmarks, multilingual corpora, NLP datasets, datasets for low resource languages
Contribution Types: Approaches to low-resource settings, Data resources
Languages Studied: Arabic, Assamese, Bangla, English, Hindi, Nepali, Turkish
Submission Number: 826