Abstract: In the era of computational social systems (CSS), where AI is increasingly embedded in critical domains such as public services, healthcare, and social governance, concerns about system vulnerabilities have become paramount. Federated learning (FL) offers a promising solution by enabling privacy-preserving and secure AI model training in real-world applications. However, traditional FL approaches often assume that client data is either imbalanced or non-independent and identically distributed (non-IID), whereas in practice, these issues frequently coexist. In such complex scenarios, minority classes may appear only on a small subset of clients, significantly degrading global model performance and convergence. In this article, we propose FedScFc, a novel FL framework designed to tackle the intertwined challenges of data imbalance and non-IID data through a principled, decoupled approach. Unlike existing methods that address these issues in isolation, our approach introduces three synergistic innovations, each operating at a different level of the FL system. At the data level, a global rebalancing mechanism employs a privacy-preserving modified Z-score to rectify class imbalance. At the client level, a distribution-aware clustering strategy groups participants based on data subspace similarity. At the feature level, a novel contrastive loss function, leveraging the effective number of samples, simultaneously mitigates local imbalance and model drift. This cohesive framework ensures that interventions at each level reinforce each other, leading to robust performance. Extensive experiments show that FedScFc outperforms state-of-the-art baselines.
DOI: 10.1109/TCSS.2025.3632830
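The abstract names two concrete ingredients: a modified Z-score over class statistics for global rebalancing, and the effective number of samples for the class-aware contrastive loss. The sketch below illustrates both quantities in isolation using their standard definitions (0.6745 * (x - median) / MAD, and (1 - beta^n) / (1 - beta) as in Cui et al.'s class-balanced loss). The function names, the beta value, and the way the scores would be consumed by FedScFc are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

# Illustrative sketch only. The abstract does not give formulas, so the
# rules below follow the standard modified Z-score and the standard
# "effective number of samples" weighting; they are assumptions about how
# such quantities could be computed, not FedScFc's exact procedure.


def modified_z_scores(class_counts):
    """Modified Z-score of per-class sample counts (robust to outliers).

    Uses the median and the median absolute deviation (MAD) instead of the
    mean and standard deviation, so a few extreme classes do not dominate.
    """
    counts = np.asarray(class_counts, dtype=float)
    median = np.median(counts)
    mad = np.median(np.abs(counts - median))
    if mad == 0:  # all classes equally sized: no imbalance signal
        return np.zeros_like(counts)
    return 0.6745 * (counts - median) / mad


def class_balanced_weights(class_counts, beta=0.999):
    """Per-class weights from the effective number of samples.

    Effective number E_c = (1 - beta**n_c) / (1 - beta); weights are
    proportional to 1 / E_c and normalized to sum to the number of classes.
    """
    counts = np.asarray(class_counts, dtype=float)
    effective_num = (1.0 - np.power(beta, counts)) / (1.0 - beta)
    weights = 1.0 / effective_num
    return weights * len(counts) / weights.sum()


if __name__ == "__main__":
    # Hypothetical aggregated class histogram (e.g., counts summed across
    # clients under some privacy-preserving aggregation); values are made up.
    counts = [5000, 4200, 900, 120, 30]
    print("modified Z-scores:     ", np.round(modified_z_scores(counts), 2))
    print("class-balanced weights:", np.round(class_balanced_weights(counts), 3))
```

In this reading, strongly negative modified Z-scores flag minority classes that global rebalancing should boost, while the effective-number weights could scale per-class terms in a local contrastive or classification loss; how the paper combines them across the data, client, and feature levels is described only qualitatively in the abstract.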