Abstract: Over the past few years, split learning has developed rapidly, driven by the shift towards distributed computing and the demand for data privacy. To improve the scalability of sequential split learning, variants such as parallel split learning and split federated learning have been proposed, but these often entail heavy computation and memory consumption on the server side, which in turn limits their scalability. Moreover, prior aggregation-based methods generally converge at a lower rate and to lower-quality models due to factors such as client drift and lag, whilst existing aggregation-free methods cannot fully benefit from parallelism. In this paper, we present a novel aggregation-free split learning paradigm termed CycleSL, which can be integrated into existing algorithms to boost model performance while consuming fewer resources. Inspired by alternating coordinate descent, CycleSL models the server-side training task as a standalone higher-level machine learning task and updates the server and clients in cyclical turns by reusing smashed data. Benefiting from feature resampling and alternating gradient steps, CycleSL has great potential to improve model performance and robustness. We integrate CycleSL into previous algorithms and benchmark them on four publicly available datasets under non-iid data distributions and partial client participation. Our results show that CycleSL notably improves model performance and convergence.
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Sai_Aparna_Aketi1
Submission Number: 4932
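To make the cyclical-update idea from the abstract concrete, below is a minimal PyTorch-style sketch of what one such round could look like, assuming a model split at a single cut layer, cross-entropy classification, and uniform resampling of cached smashed data. All names (`cyclesl_round`, `client_nets`, `server_net`, the step counts and learning rates) are hypothetical illustrations, not the authors' implementation; the paper's actual resampling strategy, optimizer, and scheduling may differ.

```python
# Hypothetical sketch of a CycleSL-style round (illustrative only, not the authors' code).
import torch
import torch.nn.functional as F

def cyclesl_round(client_nets, client_loaders, server_net,
                  client_lr=0.01, server_lr=0.01, server_steps=5, batch=64):
    """One round of alternating (block coordinate-descent-style) updates."""
    # --- Phase 1: clients forward their data and cache smashed features. ---
    cache = []  # (detached smashed features, labels) per client
    for net, loader in zip(client_nets, client_loaders):
        x, y = next(iter(loader))
        cache.append((net(x).detach(), y))

    # --- Phase 2: the server treats the pooled smashed data as a standalone ---
    # dataset and takes several gradient steps, resampling features across clients.
    feats = torch.cat([f for f, _ in cache])
    labels = torch.cat([y for _, y in cache])
    server_opt = torch.optim.SGD(server_net.parameters(), lr=server_lr)
    for _ in range(server_steps):
        idx = torch.randint(0, feats.size(0), (min(batch, feats.size(0)),))
        loss = F.cross_entropy(server_net(feats[idx]), labels[idx])
        server_opt.zero_grad()
        loss.backward()
        server_opt.step()

    # --- Phase 3: clients update against the freshly trained, frozen server. ---
    for p in server_net.parameters():
        p.requires_grad_(False)
    for net, loader in zip(client_nets, client_loaders):
        opt = torch.optim.SGD(net.parameters(), lr=client_lr)
        x, y = next(iter(loader))
        loss = F.cross_entropy(server_net(net(x)), y)
        opt.zero_grad()
        loss.backward()  # gradients flow through the frozen server to the client
        opt.step()
    for p in server_net.parameters():
        p.requires_grad_(True)
```

Under these assumptions, the reuse of cached smashed data lets the server take multiple gradient steps without additional client forward passes, and the alternation between server and client updates mirrors block coordinate descent as described in the abstract.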