Abstract: Information, and consequently misinformation, now spreads at an unprecedented speed, making it increasingly difficult to discern the credibility of rapidly circulating news. Advances in large-scale language models have enabled classifiers that identify misinformation effectively. Nevertheless, these models are intrinsically susceptible to biases introduced in numerous ways, including contaminated data sources or unfair training methodologies. When trained on biased data, machine learning models may inadvertently learn and reinforce those biases, reducing generalization performance and introducing an inherent "unfairness" into the system. Interpretability, the ability to understand and explain a model's decision-making process, can be used as a tool to expose these biases. Our research aims to identify the root causes of bias in fake news detection and to mitigate it using interpretability. We also perform inference-time attacks on fairness to validate robustness.
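To make the idea of "interpretability as a bias probe" concrete, here is a minimal sketch, not the paper's actual pipeline: a hypothetical bag-of-words fake news classifier (scikit-learn, with an invented source token "outletX" and toy examples) whose learned weights are inspected to surface spurious, potentially biased features such as the data source rather than the article content.

```python
# Illustrative sketch only (assumed setup, not the authors' method): train a tiny
# TF-IDF + logistic regression fake-news classifier and inspect its weights to
# surface features that suggest source-driven bias. "outletX" is a hypothetical token.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy corpus: "outletX" co-occurs only with the fake label, mimicking a
# contaminated data source that can induce a biased shortcut.
texts = [
    "outletX reports miracle cure found overnight",
    "outletX claims election was decided by aliens",
    "government publishes quarterly inflation figures",
    "university study finds moderate exercise improves sleep",
]
labels = [1, 1, 0, 0]  # 1 = fake, 0 = real

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

# Interpretability step: rank tokens by their learned weight toward the "fake" class.
vec = model.named_steps["tfidfvectorizer"]
clf = model.named_steps["logisticregression"]
weights = clf.coef_[0]
for idx in np.argsort(weights)[::-1][:5]:
    print(f"{vec.get_feature_names_out()[idx]:>12s}  weight={weights[idx]:+.3f}")

# If a source identifier such as "outletX" dominates the ranking, the model has
# learned a shortcut tied to the source rather than the content -- the kind of
# bias signal that attribution-style analysis can expose and mitigation can target.
```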