Language, Artificial Intelligence, and Digital Equity in North East India: A Quantitative and Policy Analysis of Low-Resource Language Exclusion, Educational Infrastructure, and Climate Knowledge System
Keywords: Northeast India, low-resource languages, artificial intelligence, natural language processing, Bhashini, UDISE+, tribal education, indigenous ecological knowledge, climate resilience, digital equity
TL;DR: NE India's 220+ languages are excluded from AI systems; scheduled-language states have 19pp higher school internet access; a three-pillar Language-Education-Climate AI framework is proposed.
Abstract: North East India is home to over 220 tribal and indigenous languages and a rich cultural heritage. Yet the region is systematically underrepresented in the data architectures that underpin artificial intelligence (AI): its languages are low-resource, its schools are digitally under-equipped, and its indigenous ecological knowledge exists almost entirely in oral, non-digitized forms. Drawing on quantitative analysis of UDISE+ 2024-25 school education data, Census 2011 language statistics, natural language processing (NLP) research from the WMT 2024 Low-Resource Indic Language Translation Shared Task, and policy analysis of the National Education Policy 2020, the Bhashini platform, and the IndiaAI Mission 2024, this paper presents six empirically grounded findings. Most importantly, states where the dominant language is constitutionally scheduled average 19 percentage points higher school internet access than states where the dominant language is non-scheduled, a structural relationship between language recognition and digital access that has not previously been quantified at the sub-national level. The paper proposes a three-pillar integrated framework for Language AI, Education, and Climate Resilience and argues that investment in AI systems designed for community languages, rather than systems built for high-resource languages and adapted for local deployment, is an important factor for equitable digital development in the region.
Submission Number: 9