DICE: Data Influence Cascade in Decentralized Learning

ICLR 2025 Conference Submission 7615 Authors

26 Sept 2024 (modified: 22 Nov 2024), ICLR 2025 Conference Submission, CC BY 4.0
Keywords: Decentralized Learning, Data Influence
TL;DR: We introduce DICE, the first comprehensive framework for measuring data influence cascades in decentralized learning.
Abstract: Decentralized learning offers a promising approach to crowdsourcing computational workloads across geographically distributed compute resources interconnected through peer-to-peer networks, accommodating the exponentially increasing compute demands of the large-model era. However, the absence of proper incentives in locally connected decentralized networks poses significant risks of free riding and malicious behavior. Data influence, which enables fair attribution of each data source's contribution, holds great potential for establishing effective incentive mechanisms. Despite its importance, little effort has been made to analyze data influence in decentralized scenarios, owing to non-trivial challenges arising from the distributed nature of decentralized networks and their localized connections. To overcome these challenges, we propose DICE, the first framework to systematically define and estimate Data Influence CascadEs in decentralized environments. DICE establishes a new perspective on influence measurement, integrating self-level and community-level contributions to capture how data influence cascades implicitly through networks via communication. Theoretically, the framework derives tractable approximations of influence cascades over arbitrary neighbor hops, uncovering for the first time that data influence in decentralized learning is shaped by a synergistic interplay of data, communication topology, and the curvature information of optimization landscapes. By bridging theoretical insights with practical applications, DICE lays the foundations for incentivized decentralized learning, including selecting suitable collaborators and identifying malicious behavior. We envision that DICE will catalyze the development of scalable, autonomous, and reciprocal decentralized learning ecosystems.
Primary Area: interpretability and explainable AI
Submission Number: 7615