HeterPS: Distributed deep learning with reinforcement learning based scheduling in heterogeneous environments

Published: 01 Jan 2023, Last Modified: 15 Nov 2024, Future Gener. Comput. Syst. 2023, CC BY-SA 4.0
Abstract: Highlights
• The design of HeterPS for distributed training with heterogeneous resources.
• A novel Reinforcement Learning (RL)-based layer scheduling method.
• A dynamic method to determine the proper number of computing resources.
• An extensive evaluation based on multiple DNN models with diverse baseline approaches.