Understanding QA Generation: Extracting Parametric and Contextual Knowledge with CQA for Low-Resource Bangla Language
Abstract: Question-Answering (QA) models for low-resource languages like Bangla face challenges due to limited annotated data and linguistic complexity. A key open issue is determining whether models rely more on pre-encoded (parametric) knowledge or on contextual input during answer generation, as existing Bangla QA datasets lack the structure required for such analysis. We introduce BanglaCQA, the first counterfactual QA dataset in Bangla, constructed by integrating counterfactual passages and answerability annotations into an existing dataset. In addition, we propose prompting-based pipelines for LLMs to disentangle parametric and contextual knowledge in both factual and counterfactual scenarios. Furthermore, we apply LLM-based evaluation techniques that measure answer quality based on semantic similarity. Our work not only introduces a novel framework for analyzing knowledge sources in Bangla QA but also uncovers critical findings that open up broader directions for counterfactual reasoning in low-resource language settings.
Paper Type: Short
Research Area: Question Answering
Research Area Keywords: Question Answering, Resources and Evaluations, Language Modeling
Contribution Types: Model analysis & interpretability, Approaches to low-resource settings, Data resources
Languages Studied: Bangla
Submission Number: 2650