Abstract: To enhance network service for diverse user devices across various scenarios, autonomous aerial vehicles (AAVs) are increasingly used as aerial base stations (ABSs). However, optimizing coverage of user devices via AAV team control is an NP-hard problem whose complexity grows exponentially with the number of user devices. To address this challenge, researchers have turned to reinforcement learning (RL) for a more practical solution. With the growing prevalence of the Internet of Things (IoT), the diversity of user devices increases, posing challenges for traditional RL: 1) the spatial distribution of devices becomes more complex; 2) variations in device types and device mobility increase training latency; 3) the high-speed movement of IoT devices can degrade the performance of widely used RL algorithms with discrete action spaces; and 4) traditional RL struggles to adapt to new environments. To solve these problems, we propose a new meta-RL framework, meta-RL with explicit task inference (Meta-ETI). We then apply this framework to efficiently train an energy-efficient AAV control policy for fair and effective coverage in 3-D dynamic environments. Meta-ETI is evaluated from both theoretical and application perspectives and demonstrates superior performance compared to the baseline frameworks. The results show that Meta-ETI adapts 2–3 times faster while maintaining competitive sample efficiency. Furthermore, in the AAV-IoT coverage application, Meta-ETI achieves 30%–50% higher energy efficiency and serves 40%–60% more devices owing to its fair coverage.
External IDs: dblp:journals/iotj/HuangSP25