A Distributed Generative AI Approach for Heterogeneous Multi-Domain Environments under Data Sharing constraints

Youssef Tawfilis; Hossam Amer; Minar El-Aasser; Tallal Elshabrawy

Published: 16 Jan 2026, Last Modified: 16 Jan 2026. Accepted by TMLR. License: CC BY 4.0

Abstract: Federated Learning has gained increasing attention for its ability to enable multiple nodes to collaboratively train machine learning models without sharing their raw data. At the same time, Generative AI, particularly Generative Adversarial Networks (GANs), has achieved remarkable success across a wide range of domains, such as healthcare, security, and image generation. However, training generative models typically requires large datasets and significant computational resources, which are often unavailable in real-world settings. Acquiring such resources can be costly and inefficient, especially when many underutilized devices with varying capabilities, such as IoT and edge devices, remain idle. Moreover, obtaining large datasets is challenging due to privacy concerns and copyright restrictions, as most devices are unwilling to share their data. To address these challenges, we propose a novel approach for decentralized GAN training that makes use of distributed data and underutilized, low-capability devices without sharing data in its raw form. Our approach is designed to tackle key challenges in decentralized environments. It combines KLD-weighted Clustered Federated Learning, which addresses data heterogeneity and multi-domain datasets, with Heterogeneous U-Shaped split learning, which addresses device heterogeneity under strict data sharing constraints, ensuring that no labels or raw data, whether real or synthetic, are ever shared between nodes. Experimental results show consistent and significant improvements across key performance metrics: an average 10% boost in classification metrics (up to 60% in multi-domain non-IID settings), 1.1× to 3× higher image generation scores for the MNIST family datasets, and 2× to 70× lower FID scores for higher-resolution datasets, at much lower latency than several benchmarks.

Submission Length: Long submission (more than 12 pages of main content)

Code: https://distributed-gen-ai.github.io/huscf-gan.github.io/

Supplementary Material: zip

Assigned Action Editor: ~Sai_Aparna_Aketi1

Submission Number: 5404
