Computation Offloading and Resource Allocation for Deep Neural Network Inference in UAV Wireless Networks
Abstract: Unmanned aerial vehicles (UAVs) can be equipped with relatively powerful servers so that they can collaboratively perform inference for pre-trained Deep Neural Networks (DNNs), enabling complex recognition tasks based on onboard sensing data such as images and video. Such collaborative inference is critical for applications where ground communication and computing infrastructure is unavailable, insecure, or not cost-efficient, such as military, disaster recovery, and rescue operations. Collaborative DNN inference in UAV wireless networks is, however, challenging because one must decide how the computation load associated with different layers of the DNN is distributed among UAVs, and how to efficiently allocate both radio and computing resources to facilitate the underlying offloading process. To this end, we formulate the joint DNN layer assignment and radio and computing resource allocation problem as an optimization problem that aims to minimize the total inference latency. To solve this difficult mixed-integer non-linear problem, we employ an alternating optimization technique and develop an efficient algorithm, named LARA. Numerical studies show that LARA performs very well across the studied scenarios, achieving up to 80% lower inference latency than baselines that perform DNN layer assignment and resource allocation heuristically, and up to 43.17% improvement over the OULD framework [13].
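To illustrate the alternating-optimization idea described in the abstract, the following is a minimal sketch, not the authors' LARA algorithm. It assumes a toy model in which UAVs form a processing chain, each UAV executes a contiguous block of DNN layers, and shared compute and bandwidth budgets are split among them; all parameter values and helper names (latency, best_assignment, allocate) are assumptions made for this example. The resource-allocation step uses the standard closed-form (Cauchy-Schwarz) solution for minimizing a sum of load/rate terms under a budget constraint.

```python
# Illustrative sketch of alternating optimization for joint DNN layer
# assignment and resource allocation. NOT the paper's actual LARA algorithm;
# the chain topology, loads, and budgets below are assumed for the example.
import itertools
import math

FLOPS = [4.0, 8.0, 6.0, 2.0, 1.0]   # per-layer compute load (GFLOPs, assumed)
OUT_MB = [3.0, 1.5, 0.8, 0.4, 0.1]  # per-layer output size (MB, assumed)
N_UAV = 3                           # UAVs arranged in a processing chain
F_TOTAL = 10.0                      # total compute budget (GFLOP/s, assumed)
B_TOTAL = 20.0                      # total bandwidth budget (MB/s, assumed)

def latency(splits, f, b):
    """End-to-end latency of a contiguous layer partition.

    splits: cut indices; UAV k runs layers [bounds[k], bounds[k+1]).
    f[k]:   compute rate of UAV k; b[k]: link rate from UAV k to UAV k+1.
    """
    bounds = (0, *splits, len(FLOPS))
    t = 0.0
    for k in range(N_UAV):
        lo, hi = bounds[k], bounds[k + 1]
        t += sum(FLOPS[lo:hi]) / f[k]          # compute time on UAV k
        if k < N_UAV - 1:
            t += OUT_MB[hi - 1] / b[k]         # forward activations to UAV k+1
    return t

def best_assignment(f, b):
    """Layer-assignment step: exhaustive search over contiguous cut points."""
    cuts = itertools.combinations(range(1, len(FLOPS)), N_UAV - 1)
    return min(cuts, key=lambda s: latency(s, f, b))

def allocate(loads, budget):
    """Resource-allocation step: minimize sum_k load_k / x_k s.t. sum_k x_k = budget.
    Closed form via Cauchy-Schwarz: x_k proportional to sqrt(load_k)."""
    roots = [math.sqrt(max(v, 1e-9)) for v in loads]
    return [budget * r / sum(roots) for r in roots]

# Alternate between the two subproblems until the latency stops improving.
f = [F_TOTAL / N_UAV] * N_UAV
b = [B_TOTAL / (N_UAV - 1)] * (N_UAV - 1)
prev = float("inf")
for _ in range(20):
    splits = best_assignment(f, b)             # fix resources, optimize cuts
    bounds = (0, *splits, len(FLOPS))
    comp = [sum(FLOPS[bounds[k]:bounds[k + 1]]) for k in range(N_UAV)]
    comm = [OUT_MB[bounds[k + 1] - 1] for k in range(N_UAV - 1)]
    f = allocate(comp, F_TOTAL)                # fix cuts, split compute budget
    b = allocate(comm, B_TOTAL)                # fix cuts, split radio budget
    cur = latency(splits, f, b)
    if prev - cur < 1e-6:
        break
    prev = cur
print(f"splits={splits}, latency={cur:.3f} s")
```

Each iteration weakly decreases the latency objective, since both subproblems are solved optimally given the other's variables, which mirrors the convergence behavior typical of alternating optimization for mixed-integer non-linear problems of this kind.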