Meta-Reinforcement Learning for Fast and Data-Efficient Spectrum Allocation in Dynamic Wireless Networks

Published: 01 Jan 2026, Last Modified: 18 May 2026
Venue: IEEE Wireless Communications Letters
License: CC BY-SA 4.0
Abstract: Efficient spectrum allocation is vital for 5G/6G networks, yet traditional deep reinforcement learning (DRL) methods suffer from high sample complexity and unsafe exploration that can disrupt network stability. To address these challenges, we propose a meta-learning framework that learns a robust initial policy capable of rapid and safe adaptation to changing wireless conditions. Using model-agnostic techniques, we implement three meta-learning architectures: model-agnostic meta-learning (MAML), a recurrent neural network (RNN), and an RNN with a self-attention mechanism. We compare them against a DRL baseline and classical heuristic approaches in a dynamic integrated access and backhaul (IAB) environment. The attention-based agent achieves a peak throughput of $\approx 49$ Mbps, reduces signal-to-interference-plus-noise ratio (SINR) and latency violations by over 60% relative to proximal policy optimization (PPO), and attains 97% of the fairness level of the exhaustive-search upper bound. These results demonstrate that meta-learning enables data-efficient, reliable, and scalable spectrum management for next-generation wireless systems.
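To make the idea of "learning an initial policy that adapts quickly" concrete, the following is a minimal sketch of a first-order MAML-style loop (a simplification of full MAML that ignores second-order terms). It is not the paper's implementation: a toy one-parameter regression stands in for the wireless environment, and every name and hyperparameter here (`sample_task`, `adapt`, `meta_train`, `alpha`, `beta`) is a hypothetical choice made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_task():
    """Hypothetical task family: fit y = a * x, with slope a drawn per task."""
    return rng.uniform(0.5, 1.5)

def loss_and_grad(w, a, x):
    """Mean squared error of the model y = w * x, and its gradient in w."""
    err = w * x - a * x
    return np.mean(err ** 2), np.mean(2.0 * err * x)

def adapt(w, a, x, alpha):
    """Inner loop: one gradient step on the task's support data."""
    _, g = loss_and_grad(w, a, x)
    return w - alpha * g

def meta_train(w=0.0, alpha=0.1, beta=0.05, iters=500, tasks_per_batch=8):
    """Outer loop: move the initialization w so that one inner step works well."""
    for _ in range(iters):
        meta_grad = 0.0
        for _ in range(tasks_per_batch):
            a = sample_task()
            x_support = rng.uniform(-1.0, 1.0, size=10)
            x_query = rng.uniform(-1.0, 1.0, size=10)
            w_adapted = adapt(w, a, x_support, alpha)
            # First-order approximation: use the query gradient taken at the
            # adapted parameter, without differentiating through adapt().
            _, g = loss_and_grad(w_adapted, a, x_query)
            meta_grad += g
        w = w - beta * meta_grad / tasks_per_batch
    return w

w_meta = meta_train()

# Adapt to an unseen task with a single gradient step.
a_new = 1.4
x_new = np.linspace(-1.0, 1.0, 20)
loss_before, _ = loss_and_grad(w_meta, a_new, x_new)
w_new = adapt(w_meta, a_new, x_new, alpha=0.1)
loss_after, _ = loss_and_grad(w_new, a_new, x_new)
print(f"meta-learned init: {w_meta:.3f}")
print(f"loss before adaptation: {loss_before:.4f}, after one step: {loss_after:.4f}")
```

In this toy setting the meta-learned initialization settles near the centre of the task distribution, so a single gradient step moves it toward any new task. In a spectrum-allocation setting the scalar `w` would be replaced by policy-network parameters and the squared error by a reinforcement-learning objective, which this sketch does not attempt to model.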