Projected Latent Distillation for Data-Agnostic Consolidation in Multi-Agent Continual Learning

22 Sept 2022 (modified: 13 Feb 2023) · ICLR 2023 Conference Withdrawn Submission · Readers: Everyone
Keywords: continual learning, knowledge distillation
Abstract: Many real-world applications are characterized by non-stationary distributions. In this setting, independent expert models trained on subsets of the data can benefit from each other, improving their generalization and forward transfer by sharing knowledge. In this paper, we formalize this problem as a multi-agent continual learning scenario, where agents are trained independently but can communicate by sharing model parameters after each learning experience. We split the learning problem into two phases: adaptation and consolidation. Adaptation is a learning phase that optimizes the current task, while consolidation prevents forgetting by combining expert models, enabling knowledge sharing. We propose Data-Agnostic Consolidation (DAC), a novel double knowledge distillation method that performs distillation in the latent space via a novel Projected Latent Distillation (PLD) loss. Experimental results show state-of-the-art accuracy on SplitCIFAR100 even when a single out-of-distribution image is used as the only source of data during consolidation.
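The abstract describes a double knowledge distillation loss: a standard output-space distillation term combined with the Projected Latent Distillation term, which matches the student's latent representation to the teacher's through a projection. The paper's exact formulation is not given on this page, so the following is only a minimal NumPy sketch under the assumption that PLD is a mean-squared error between a (hypothetical) linear projection `W` of the student latent and the teacher latent, and that the output term is classic temperature-softened distillation; all function names and the weighting scheme are illustrative, not the authors' implementation.

```python
import numpy as np

def softmax(z, T=1.0):
    # temperature-softened softmax, numerically stabilized
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, T=2.0):
    # output-space distillation: cross-entropy between softened distributions
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return -(p * np.log(q + 1e-12)).sum(axis=-1).mean()

def pld_loss(student_latent, teacher_latent, W):
    # projected latent distillation (sketch): map the student latent through
    # a hypothetical learned projection W, then match the teacher latent
    projected = student_latent @ W
    return ((projected - teacher_latent) ** 2).mean()

def double_distillation_loss(s_logits, t_logits, s_latent, t_latent,
                             W, alpha=1.0, T=2.0):
    # "double" distillation: output-space KD plus latent-space PLD,
    # weighted by an assumed trade-off coefficient alpha
    return kd_loss(s_logits, t_logits, T) + alpha * pld_loss(s_latent, t_latent, W)
```

In a data-agnostic consolidation setting, the inputs fed to both networks need not come from the task distribution (e.g., a single out-of-distribution image, as in the abstract); only the teacher's responses are matched.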
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Deep Learning and representational learning
Supplementary Material: zip