Assisted Learning for Organizations with Limited Imbalanced Data

Published: 06 May 2023, Last Modified: 06 May 2023. Accepted by TMLR.
Abstract: In the era of big data, many large organizations are integrating machine learning into their work pipelines to facilitate data analysis. However, the performance of their trained models is often restricted by the limited and imbalanced data available to them. In this work, we develop an assisted learning framework that helps such organizations improve their learning performance. The organizations have sufficient computation resources but are subject to stringent data-sharing and collaboration policies, and their limited, imbalanced data often cause biased inference and sub-optimal decision-making. In assisted learning, an organizational learner purchases an assistance service from an external service provider and aims to enhance its model performance within only a few assistance rounds. We develop effective stochastic training algorithms for both assisted deep learning and assisted reinforcement learning. Unlike existing distributed algorithms that need to frequently transmit gradients or models, our framework allows the learner to share information with the service provider only occasionally, yet still obtain a model that achieves near-oracle performance, as if all the data were centralized.
Submission Length: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Tie-Yan_Liu1
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Number: 614