Keywords: Federated Offline Reinforcement Learning, Client Capacity Heterogeneity, Progressive Distillation, Model-Heterogeneous Federated Learning, Edge Intelligence
TL;DR: Using masking and progressive distillation, we enable low-capacity clients to participate in federated offline reinforcement learning without sacrificing full-model performance.
Abstract: Federated offline reinforcement learning (FORL) is a promising abstraction for distributed edge decision making, particularly in resource-heterogeneous wireless environments with decentralized operational logs. Existing FORL methods, however, typically assume that every client can train the same full policy model, which is unrealistic when memory, compute, and energy budgets vary substantially across clients. We study FORL under client capacity heterogeneity, where high-capacity clients train a full policy model while low-capacity clients can train only masked, constrained models. We propose a capacity-aware framework that combines a high-capacity warm start, aggregation-compatible masking, masked parameter aggregation, and server-side progressive distillation to transfer knowledge from an evolving full model to a constrained model that remains trainable by low-capacity clients. To isolate this algorithmic question, we evaluate Decision Transformer on standard D4RL locomotion tasks rather than domain-specific wireless simulators. Results provide preliminary evidence that the framework can incorporate low-capacity clients without collapsing full-model performance while improving constrained-model performance relative to uniformly low-capacity federated training.
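Illustrative sketch (not taken from the submission): the abstract does not give the aggregation rule, so the snippet below shows one common way masked parameter aggregation can be realized, assuming flat parameter vectors, binary per-client masks, and dataset-size weights. The function name `masked_aggregate` and all of its arguments are hypothetical. Each coordinate is averaged only over the clients whose mask covers it, and coordinates that no client trained keep the previous global value.

```python
import numpy as np

def masked_aggregate(global_params, client_params, client_masks, client_weights):
    """Per-coordinate weighted average over the clients that trained that coordinate.

    Assumptions (hypothetical, for illustration only):
      - parameters are flattened into 1-D arrays of equal length;
      - client_masks[i][j] == 1 means client i trained coordinate j;
      - client_weights are non-negative, e.g. local dataset sizes.
    """
    num = np.zeros_like(global_params, dtype=float)  # weighted sum of updates
    den = np.zeros_like(global_params, dtype=float)  # total weight per coordinate
    for params, mask, weight in zip(client_params, client_masks, client_weights):
        m = np.asarray(mask, dtype=float)
        num += weight * m * np.asarray(params, dtype=float)
        den += weight * m
    covered = den > 0
    out = np.asarray(global_params, dtype=float).copy()
    # Coordinates no client covered keep their previous global value.
    out[covered] = num[covered] / den[covered]
    return out

# One high-capacity client (full mask) and one low-capacity client (half mask).
previous_global = np.zeros(4)
high = np.array([2.0, 4.0, 6.0, 8.0])
low = np.array([4.0, 0.0, 99.0, 99.0])  # masked-out entries are ignored
new_global = masked_aggregate(
    previous_global,
    [high, low],
    [np.array([1, 1, 1, 1]), np.array([1, 1, 0, 0])],
    [1.0, 1.0],
)
print(new_global)  # values: 3, 2, 6, 8
```

In this toy case the first two coordinates are averaged across both clients, while the last two come from the high-capacity client alone, so the low-capacity client never dilutes parameters it did not train.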
Submission Number: 19