Does LLM Quantization Solve the Low-Resource Double Bind for Urdu?

ACL ARR 2026 January Submission 10897 Authors

06 Jan 2026 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · CC BY 4.0
Keywords: benchmarking, automatic creation and evaluation of language resources, automatic evaluation of datasets, evaluation methodologies, evaluation, metrics
Abstract: Many communities across the globe face a “low-resource double bind”: limited computing power and scarce local-language data for local LLM development. Model compression techniques such as quantization are proposed as a solution, but the performance of quantized LLMs on low-resource languages, especially those written in non-Latin scripts such as Urdu, remains underexplored. In this study we evaluate eight quantized small LLMs, including Gemma3, Llama3.1, and GPT-oss, on eight Urdu generation and classification tasks. While GPT-oss leads in classification and Llama3.1 dominates in generation, we find that all models exhibit highly volatile performance and fall short of benchmarks across tasks. We further trace these results to the models’ training datasets and architectures, and find that smaller models like Gemma3 can perform on par with models ten times their size when pretrained on multilingual corpora. Finally, we present a quality evaluation of Urdu benchmarking datasets and scores, highlighting critical flaws and suggesting design principles for more faithful evaluations. Our study demonstrates that, in the absence of authentic evaluations on individual low-resource languages, LLM quantization risks creating second-tier AI for low-resource communities.
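For context, a minimal sketch of what “quantized small LLM” inference on an Urdu input looks like in practice, assuming Hugging Face transformers with bitsandbytes 4-bit quantization; the checkpoint, quantization settings, and prompt below are illustrative assumptions, not the paper’s actual evaluation pipeline:

```python
# Sketch: loading a 4-bit quantized small LLM and running an Urdu prompt.
# Assumes transformers + bitsandbytes + accelerate are installed; the model
# choice and prompt are hypothetical, for illustration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-3.1-8B-Instruct"  # assumed checkpoint

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit weights to cut memory roughly 4x
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in bf16 for stability
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=quant_config, device_map="auto"
)

# An Urdu classification-style prompt ("State the sentiment of this sentence:
# this film was magnificent."), illustrative only.
prompt = "اس جملے کا جذباتی رجحان بتائیں: یہ فلم بہت شاندار تھی۔"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```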
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: benchmarking, automatic creation and evaluation of language resources, automatic evaluation of datasets, evaluation methodologies, evaluation, metrics
Contribution Types: Approaches to low-resource settings, Approaches to low-compute settings (efficiency)
Languages Studied: Urdu
Submission Number: 10897