Abstract: Highlights
• A new performance prediction framework (termed PerfTop) is proposed to accurately predict the execution time of distributed learning over general topologies.
• The framework provides an in-depth analysis of the underlying communication mechanisms and accounts for the overlap between computation and communication.
• Extensive experiments show that PerfTop achieves an accuracy above 85% in predicting the iteration time of distributed training over general topologies.