Split Learning on Multi-source Cross-Streams

Published: 01 Jan 2024, Last Modified: 18 Jul 2025, Venue: ICONIP (5) 2024, License: CC BY-SA 4.0
Abstract: In recent years, distributed data stream mining has gained prominence in biomedical applications (e.g., disease prediction), as institutes and hospitals continuously accumulate vast amounts of patient data. While split learning is commonly employed for privacy preservation, applying it to real-world distributed data streaming scenarios is challenging. Contrary to the conventional assumption of a one-to-one correspondence between data sources and clients, these scenarios involve data streams that originate from multiple sources and flow across multiple hospitals. This complexity leads to substantial communication overhead, caused by the sequential data analysis and dynamic data updates that data stream learning entails. To overcome these challenges, we propose SLStream, a novel split learning framework tailored for multi-source cross-streaming data. SLStream employs training scheduling to accommodate diverse patient visit sequences and optimize batch processing. Furthermore, it introduces a mechanism that assesses the value of patient records so that data updates can be applied selectively. Extensive experiments validate SLStream’s efficacy, demonstrating notable improvements in both communication efficiency and model performance.