PipeMCTS: Pipeline Inference Optimization for Edge Computing via Surrogate Model-Guided MCTS

Zhenyi Wang, Jianjun Ding, Dan Xian, Yusheng Jin, Dejun Hua, Wenbin Hua, Wenkai Lv, Chengmin Lin

Published: 2025, Last Modified: 16 Apr 2026, IEEE Internet Things J. 2025, CC BY-SA 4.0
Abstract: For deep neural network (DNN) deployment on Internet of Things (IoT) devices, pipeline-based inference coordinates the diverse computing resources of heterogeneous multiprocessor systems-on-chip (HMPSoCs) to achieve efficient execution. However, determining the optimal pipeline configuration is challenging because the search space grows exponentially and layer-wise dependencies are intricate. Existing methods formulate this as a single-shot optimization problem, an approach that struggles to explore the search space efficiently and incurs substantial resource overhead from repeatedly evaluating suboptimal configurations. This article proposes PipeMCTS, which reformulates pipeline deployment as a sequential optimization problem solved via Monte Carlo tree search (MCTS). By constructing the search tree incrementally, layer by layer, PipeMCTS prunes unpromising branches and accelerates the search. PipeMCTS incorporates two key components to enhance search efficiency: 1) a temperature-controlled selection strategy that balances exploration and exploitation in the complex search space and 2) an uncertainty-aware simulation strategy that accelerates search convergence through Gaussian process (GP)-guided evaluation. Experimental results demonstrate that PipeMCTS achieves a 66.42% improvement in throughput compared to the Arm Compute Library (ARM-CL).
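The abstract does not give the exact form of the temperature-controlled selection strategy. As a minimal sketch only, one common way to trade off exploration and exploitation with a temperature is Boltzmann (softmax) sampling over the estimated values of a node's children. The function names, the example values, and the softmax rule itself are assumptions made for illustration, not the authors' method.

```python
import math
import random


def selection_probs(values, temperature):
    """Softmax over child values (assumed rule, not taken from the paper).

    A low temperature concentrates probability on the best child
    (exploitation); a high temperature flattens it (exploration).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    best = max(values)
    # Subtract the maximum before exponentiating for numerical stability.
    weights = [math.exp((v - best) / temperature) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def select_child(values, temperature, rng=random):
    """Sample a child index according to the softmax probabilities."""
    probs = selection_probs(values, temperature)
    return rng.choices(range(len(values)), weights=probs, k=1)[0]


# Hypothetical value estimates for three candidate placements of the next layer.
child_values = [0.2, 0.5, 0.9]
cold = selection_probs(child_values, temperature=0.01)   # nearly greedy
hot = selection_probs(child_values, temperature=100.0)   # nearly uniform
chosen = select_child(child_values, temperature=0.5, rng=random.Random(0))
```

Under this assumption, annealing the temperature from high to low over the course of the search would shift the tree policy from broad exploration toward exploiting the most promising pipeline configurations.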