SAC: A Novel Multi-hop Routing Policy in Hybrid Distributed IoT System based on Multi-agent Reinforcement Learning

Published: 01 Jan 2021, Last Modified: 11 Apr 2025 · ISQED 2021 · CC BY-SA 4.0
Abstract: Energy harvesting (EH) IoT devices have attracted broad attention in both academia and industry because they can operate sustainably by harvesting energy from the ambient environment. However, due to the weak and transient nature of harvested power, EH technology cannot support power-intensive IoT devices such as IoT edge servers. Hybrid IoT systems in which EH and non-EH IoT devices coexist are therefore forthcoming. This paper explores the routing problem in such a hybrid distributed IoT system. We first propose a comprehensive multi-hop routing mechanism for this hybrid system. We then propose a distributed multi-agent deep reinforcement learning algorithm, spatial asynchronous advantage actor-critic (SAC), that jointly optimizes the system routing policy and energy allocation to maximize the total amount of transmitted data and the overall data delivered to the sink node. Experiments indicate that SAC achieves, on average, at least $\sim 1.5\times$ the transmission rate and $\sim 12.9\times$ the sink packet delivery rate of the baselines.
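The actor-critic core of such a learned routing policy can be illustrated with a minimal tabular sketch. This is a hypothetical toy example on a five-node line network, not the paper's SAC algorithm or its hybrid-energy system model: each non-sink node runs a small softmax policy (actor) over its neighbors and a state-value estimate (critic), updated with a one-step temporal-difference advantage.

```python
import math
import random

random.seed(0)

# Hypothetical toy line network: node 0 -> ... -> node 4 (sink).
# Each non-sink node-agent chooses a neighbor to forward a packet to.
NEIGHBORS = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4]}
SINK = 4
GAMMA, ALPHA = 0.9, 0.1  # discount factor, learning rate (illustrative values)

# Per-node actor (policy logits over neighbors) and critic (state value).
logits = {n: [0.0] * len(a) for n, a in NEIGHBORS.items()}
value = {n: 0.0 for n in list(NEIGHBORS) + [SINK]}

def softmax(xs):
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [x / s for x in e]

def choose(node):
    """Sample a neighbor index from the node's softmax policy."""
    probs = softmax(logits[node])
    r, acc = random.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r <= acc:
            return i
    return len(probs) - 1

def episode(max_hops=20):
    """Forward one packet from node 0; learn along the way."""
    node, hops = 0, 0
    while node != SINK and hops < max_hops:
        a = choose(node)
        nxt = NEIGHBORS[node][a]
        reward = 1.0 if nxt == SINK else -0.1     # delivery bonus vs hop cost
        td = reward + GAMMA * value[nxt] - value[node]  # advantage estimate
        value[node] += ALPHA * td                 # critic update
        probs = softmax(logits[node])
        for i in range(len(probs)):               # actor: policy-gradient step
            grad = (1.0 if i == a else 0.0) - probs[i]
            logits[node][i] += ALPHA * td * grad
        node, hops = nxt, hops + 1
    return node == SINK

for _ in range(500):      # training episodes
    episode()

delivered = sum(episode() for _ in range(100))  # deliveries out of 100 packets
```

After training, node 3's policy concentrates on forwarding toward the sink, so most packets are delivered within the hop budget. The paper's algorithm differs in that each agent uses a deep network, asynchronous updates, and an energy-allocation action space; this sketch only shows the shared actor-critic update rule.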