Abstract: Large language models (LLMs) open up new possibilities for low-resource languages (LRLs) by showing impressive results on various classification and generation tasks with zero-/few-shot inference. Owing to pre-training on large datasets, including web-based corpora, these LLMs encode factual knowledge, which can be either time-dependent or time-independent. That said, the capability of LLMs to recall factual information in LRLs is an interesting yet under-explored task. In this work, we present Indic-MULAN, a benchmark dataset to evaluate the capability of LLMs to extract time-aware factual knowledge involving one-to-one and one-to-many relations. Our dataset comprises 34 relations and $\sim$30K queries covering 11 Indian languages. We experiment with two LLMs, GPT-4 (proprietary) and Llama-3 (open-source). We find that performance is poor when the models are queried in the native languages but improves when the queries are translated into English. We then briefly analyze the embedding space using t-SNE plots, which leads to some interesting observations. We hope Indic-MULAN will help future studies of time-aware factual knowledge in LLMs for Indian languages.
Paper Type: Short
Research Area: Resources and Evaluation
Research Area Keywords: datasets for low resource languages, benchmarking
Contribution Types: Data resources, Data analysis
Languages Studied: Assamese (as), Bengali (bn), Gujarati (gu), Hindi (hi), Kannada (kn), Malayalam (ml), Marathi (mr), Odia (or), Punjabi (pa), Tamil (ta), Telugu (te)
Submission Number: 4154