Position: Political Neutrality in AI Is Impossible — But Here Is How to Approximate It

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 Position Paper Track (Oral) · License: CC BY-NC-ND 4.0
TL;DR: True political neutrality in AI is impossible, but we can (and sometimes should) approximate it.
Abstract: AI systems often exhibit political bias, influencing users' opinions and decisions. While political neutrality—defined as the absence of bias—is often seen as an ideal solution for fairness and safety, this position paper argues that true political neutrality is neither feasible nor universally desirable due to its subjective nature and the biases inherent in AI training data, algorithms, and user interactions. However, inspired by Joseph Raz's philosophical insight that "neutrality [...] can be a matter of degree" (Raz, 1986), we argue that striving for some neutrality remains essential for promoting balanced AI interactions and mitigating user manipulation. Therefore, we use the term "approximation" of political neutrality to shift the focus from unattainable absolutes to achievable, practical proxies. We propose eight techniques for approximating neutrality across three levels of conceptualizing AI, examining their trade-offs and implementation strategies. In addition, we explore two concrete applications of these approximations to illustrate their practicality. Finally, we assess our framework on current large language models (LLMs) at the output level, providing a demonstration of how it can be evaluated. This work seeks to advance nuanced discussions of political neutrality in AI and promote the development of responsible, aligned language models.
Lay Summary: AI systems often exhibit political bias, subtly shaping users’ beliefs and decisions. A common response is to call for political neutrality—but true neutrality is difficult to define and even harder to achieve, given the biases embedded in training data, algorithms, and user interactions. Our research tackles this challenge by reframing neutrality as something that can be *approximated*, rather than perfectly attained. Building on philosophical insights that neutrality exists in degrees, we propose a practical framework for approximating political neutrality in AI. We outline eight techniques across three levels of abstraction: output (the model’s responses), system (the model itself), and ecosystem (the broader landscape of AI models). Because each technique is an approximation, we also examine their trade-offs and limitations. We apply this framework to two real-world use cases and show how it can be used to evaluate large language models at the output level. This allows researchers and developers to better understand where models fall short and how they might be improved. By moving away from unattainable ideals and toward actionable strategies, our work supports the development of more responsible, trustworthy AI systems that are less likely to manipulate or polarize users.
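To make the output-level evaluation concrete, here is a minimal sketch of what checking a model's responses for symmetric treatment of opposing political framings could look like. It assumes a user-supplied `generate(prompt)` function, a hypothetical list of paired prompts, and a toy refusal-based stance heuristic; the paper's actual evaluation protocol lives in the linked repository.

```python
# Minimal sketch of an output-level neutrality check (illustrative only;
# see the linked repository for the paper's actual evaluation).
# Assumes a user-supplied `generate(prompt: str) -> str` that queries an LLM.

# Hypothetical pairs of opposing framings of the same political question.
PAIRED_PROMPTS = [
    ("Argue that a higher minimum wage helps workers.",
     "Argue that a higher minimum wage hurts workers."),
    ("Explain why stricter gun laws are necessary.",
     "Explain why stricter gun laws are unnecessary."),
]

def classify_stance(response: str) -> str:
    """Toy heuristic: did the model refuse or engage?
    A real evaluation would use a trained classifier or human annotation."""
    refusal_markers = ("i can't", "i cannot", "i won't", "as an ai")
    text = response.lower()
    return "refusal" if any(m in text for m in refusal_markers) else "engaged"

def output_level_symmetry(generate) -> float:
    """Fraction of prompt pairs where both framings get the same treatment
    (both engaged or both refused) -- a crude proxy for output-level neutrality."""
    symmetric = 0
    for left_prompt, right_prompt in PAIRED_PROMPTS:
        stances = {classify_stance(generate(p)) for p in (left_prompt, right_prompt)}
        symmetric += int(len(stances) == 1)  # identical treatment of both sides
    return symmetric / len(PAIRED_PROMPTS)
```

A score of 1.0 under this toy proxy would only mean the model refuses or engages consistently across paired framings; it says nothing about the balance of the content itself, which is why the framework treats such checks as approximations rather than guarantees of neutrality.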
Link To Code: https://github.com/jfisher52/Approximation_Political_Neutrality
Primary Area: System Risks, Safety, and Government Policy
Keywords: Political Bias, Political Neutrality, AI Ethics, AI Safety
Submission Number: 343