Split Learning Based GAN Training for Non-IID Federated Learning

Joana Tirana, Andreas Chouliaras, Theodoros Aslanidis, John Byabazaire, Spyridon Mastorakis, Dimitris Chatzopoulos

Published: 01 Jan 2026, Last Modified: 24 Jan 2026. License: CC BY-SA 4.0
Abstract: A crucial issue in Federated Learning (FL) is that the participating devices, i.e., the data owners, typically collect non-Independent-and-Identically-Distributed (non-IID) data. This degrades training performance, e.g., by slowing training or preventing convergence. Consequently, additional methods, such as data augmentation, are applied to address this problem. In particular, producing synthetic data with GAN models has proven to be an effective remedy for non-IID data in FL. However, the data owners then need to participate in an additional FL process to train the GANs in a privacy-preserving manner, which is not always feasible due to the data owners' resource limitations. In this position paper, we identify the issues that arise when training GANs with FL for data augmentation, and we propose a lightweight alternative that uses Split Learning (SL) to offload the computational load to a compute node. Furthermore, we highlight the gap in cloud-based FL-SL integration and propose a microservice architecture based on existing tools that could significantly simplify FL-SL deployment and orchestration while balancing energy and financial costs.
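To make the SL idea concrete, the following is a minimal, purely illustrative sketch of how a training step is split between a data owner and a compute node: the client runs only the first part of the model on its raw data and sends the intermediate ("smashed") activation to the compute node, which runs the heavier remainder and returns only the gradient of that activation. The one-parameter "layers", function names, and learning rate are all assumptions for illustration, not the architecture or protocol proposed in the paper.

```python
# Toy Split Learning (SL) step: the model y_hat = (x * w_client) * w_server
# is split so that raw data x never leaves the data owner's device.
# All names and the one-neuron model are illustrative assumptions.

def client_forward(x, w_client):
    """Data owner: compute the 'smashed' activation sent to the compute node."""
    return x * w_client  # raw data x stays on the device

def server_forward(smashed, w_server):
    """Compute node: run the remaining (heavier) part of the model."""
    return smashed * w_server

def train_step(x, y, w_client, w_server, lr=0.1):
    # Forward pass, split across the two parties.
    a = client_forward(x, w_client)
    y_hat = server_forward(a, w_server)
    err = y_hat - y                 # gradient of 0.5 * (y_hat - y)^2 w.r.t. y_hat
    # Backward pass: the compute node updates its parameters and
    # sends back only the gradient w.r.t. the smashed activation.
    grad_a = err * w_server
    w_server -= lr * err * a
    # The data owner finishes backpropagation locally using grad_a.
    w_client -= lr * grad_a * x
    return w_client, w_server

w_c, w_s = 0.5, 0.5
for _ in range(200):
    w_c, w_s = train_step(2.0, 3.0, w_c, w_s)
print(round(server_forward(client_forward(2.0, w_c), w_s), 2))  # approaches the target 3.0
```

The key property this sketch illustrates is the communication pattern: per step, the parties exchange one activation and one gradient of the same size, while the compute node carries most of the arithmetic, which is what makes SL attractive for resource-constrained data owners.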