Evaluating Socio-Ecological Bias in Retrieval Augmented Generation: A Case Study on Interdisciplinary Agricultural Resilience
Keywords: Retrieval-Augmented Generation (RAG), Large Language Models (LLMs), Domain Bias, Interdisciplinary Evaluation, Socio-Ecological Systems, Agricultural Resilience, Bias in NLP, Trustworthy AI
Abstract: This study examines domain bias in Retrieval-Augmented Generation (RAG) systems within the socio-ecological context of agricultural resilience. Using a multi-model framework with DeepSeek-R1 and Llama-3.2 as generative backbones, paired with Nomic-Embed-Text and EmbeddingGemma for document embedding, we construct balanced corpora of ecological and social science articles and design two controlled experiments to disentangle retrieval and prompt effects. The results reveal a nuanced, multi-stage bias pattern: the retrieval stage consistently favors ecological variables (particularly in Context Relevance), whereas the generation stage shows a significant reversal, favoring social variables on faithfulness under prompt-bias conditions. Our findings highlight the hidden risks that domain bias poses when RAG systems are applied to socio-ecological policy-making.
Paper Type: Long
Research Area: Computational Social Science, Cultural Analytics, and NLP for Social Good
Research Area Keywords: domain bias, retrieval bias, prompt bias, model evaluation
Contribution Types: Model analysis & interpretability, Data analysis
Languages Studied: English
Submission Number: 3926