Abstract: For deep neural network (DNN) deployment on Internet of Things (IoT) devices, pipeline-based inference coordinates the diverse computing resources of heterogeneous multiprocessor system-on-chips (HMPSoCs) to achieve efficient execution. However, determining the optimal pipeline configuration is challenging because the search space grows exponentially and layers have intricate dependencies on one another. Existing methods formulate this as a single-shot optimization problem, an approach that struggles to explore the search space efficiently and incurs substantial resource overhead from repeated evaluations of suboptimal configurations. This article proposes PipeMCTS, which reformulates pipeline deployment as a sequential optimization problem solved via Monte Carlo tree search (MCTS). By incrementally constructing the search tree in a layer-wise manner, PipeMCTS effectively prunes unpromising branches and accelerates the search process. PipeMCTS incorporates two key components to enhance search efficiency: 1) a temperature-controlled selection strategy that balances exploration and exploitation in the complex search space and 2) an uncertainty-aware simulation strategy that accelerates search convergence through Gaussian process (GP)-guided evaluation. Experimental results demonstrate that PipeMCTS achieves a 66.42% improvement in throughput compared to the Arm Compute Library (ARM-CL).
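To make the idea of temperature-controlled selection concrete, the following is a minimal sketch assuming a Boltzmann (softmax) rule over child value estimates. The abstract does not give the exact selection rule used by PipeMCTS, so the function names (`softmax_probs`, `select_child`), the softmax form, and the numeric estimates are all illustrative assumptions rather than the authors' method.

```python
import math
import random


def softmax_probs(values, temperature):
    """Boltzmann probabilities over child value estimates (assumed rule)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(values)
    weights = [math.exp((v - peak) / temperature) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def select_child(values, temperature, rng=random):
    """Sample a child index: high temperature explores, low temperature exploits."""
    probs = softmax_probs(values, temperature)
    return rng.choices(range(len(values)), weights=probs, k=1)[0]


# Hypothetical normalized throughput estimates for three candidate placements
# of the next layer; the numbers are illustrative only.
estimates = [0.2, 0.5, 0.9]
hot = softmax_probs(estimates, temperature=10.0)   # close to uniform
cold = softmax_probs(estimates, temperature=0.05)  # concentrated on the best child
```

With a high temperature the three candidates are sampled almost uniformly, which favors exploration; lowering the temperature shifts nearly all probability to the highest-valued child, which favors exploitation.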
External IDs:dblp:journals/iotj/WangDXJHHLL25