Abstract: Political bias detection in news articles is often evaluated using random train–test splits, which can allow models to exploit publisher-specific artifacts rather than learn genuine ideological signals. We study the impact of this issue by comparing random splits with an outlet-controlled split where publishers are disjoint across training and test sets. Using 37,554 U.S. news articles labeled by outlet bias, we evaluate traditional machine learning models with sentence embeddings, a fine-tuned Transformer (ModernBERT), and zero- and few-shot prompting with a large language model. Results show that supervised models degrade sharply under outlet-controlled evaluation, with traditional approaches near chance and Transformers achieving only modest gains. In contrast, large language models generalize better to unseen outlets. These findings demonstrate that random splits overestimate performance and highlight the importance of outlet-controlled evaluation for robust political bias detection.
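The outlet-controlled evaluation described above amounts to a grouped train–test split keyed on the publisher. As a minimal sketch (not the authors' exact pipeline), this can be expressed with scikit-learn's GroupShuffleSplit; the column names ("outlet", "text", "bias_label") are illustrative assumptions rather than names from the paper.

```python
# Minimal sketch of an outlet-controlled split: articles from the same publisher
# never appear in both the training and the test set.
# Column names ("outlet", "text", "bias_label") are illustrative assumptions.
import pandas as pd
from sklearn.model_selection import GroupShuffleSplit


def outlet_controlled_split(df: pd.DataFrame, test_size: float = 0.2, seed: int = 42):
    """Split a DataFrame so publisher groups are disjoint across train and test."""
    splitter = GroupShuffleSplit(n_splits=1, test_size=test_size, random_state=seed)
    train_idx, test_idx = next(splitter.split(df, groups=df["outlet"]))
    return df.iloc[train_idx], df.iloc[test_idx]


# Toy usage: no outlet appears on both sides of the split.
df = pd.DataFrame({
    "outlet": ["A", "A", "B", "B", "C", "C"],
    "text": ["..."] * 6,
    "bias_label": ["left", "left", "center", "center", "right", "right"],
})
train_df, test_df = outlet_controlled_split(df)
assert set(train_df["outlet"]).isdisjoint(set(test_df["outlet"]))
```

A random split, by contrast, shuffles at the article level, so a model can match publisher-specific artifacts seen in training against test articles from the same outlet.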