Evaluating Credibility and Political Bias in LLMs for News Outlets in Bangladesh

Published: 22 Jun 2025, Last Modified: 17 Jul 2025 · ACL-SRW 2025 Poster · CC BY 4.0
Keywords: Large Language Models (LLMs), Political Bias, Credibility Assessment, News Outlets
TL;DR: This study audits nine LLMs and finds that while they rate Bangladeshi news sources with internal consistency, they exhibit moderate alignment with human judgments and reveal political bias favoring pro-government outlets.
Abstract: Large language models (LLMs) are widely used in search engines to provide direct answers, while AI chatbots retrieve updated information from the web. As these systems influence how billions access information, evaluating the credibility of news outlets has become crucial. We audit nine LLMs from OpenAI, Google, and Meta to assess their ability to evaluate the credibility and political bias of the top 20 most popular news outlets in Bangladesh. While most LLMs rate the tested outlets, larger models often refuse to rate sources due to insufficient information, while smaller models are more prone to hallucinations. We create a dataset of credibility ratings and political identities based on journalism experts' opinions and compare these with LLM responses. We find strong internal consistency in LLM credibility ratings, with an average correlation coefficient (ρ) of 0.72, but moderate alignment with expert evaluations, with an average ρ of 0.45. Most LLMs (GPT-4, GPT-4o-mini, Llama 3.3, Llama-3.1-70B, Llama 3.1 8B, and Gemini 1.5 Pro) in their default configurations favor the left-leaning Bangladesh Awami League, giving higher credibility ratings, and show misalignment with human experts. These findings highlight the significant role of LLMs in shaping news and political information.
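The alignment metric described in the abstract can be illustrated with a short sketch. Note this is not the paper's code or data: the outlet ratings below are hypothetical, and Spearman's rank correlation is assumed as the ρ being reported, since the abstract compares ordinal credibility ratings.

```python
# Illustrative sketch (hypothetical data, not from the paper): comparing
# LLM credibility ratings against expert ratings for a set of outlets
# using Spearman's rank correlation coefficient (ρ).
from scipy.stats import spearmanr

# Hypothetical 1-10 credibility ratings for five outlets.
expert_ratings = [9, 7, 8, 4, 6]
llm_ratings = [8, 6, 9, 3, 5]

rho, p_value = spearmanr(expert_ratings, llm_ratings)
print(f"Spearman rho = {rho:.2f}")  # → Spearman rho = 0.90
```

A ρ near 1 would indicate the LLM ranks outlets almost exactly as the experts do; the paper's reported average of 0.45 corresponds to only moderate agreement on that ordering.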
Archival Status: Archival
ACL Copyright Transfer: pdf
Paper Length: Long Paper (up to 8 pages of content)
Submission Number: 145