Distributional Dataset Distillation with Subtask Decomposition

ICLR 2024 Workshop DMLR Submission 76 Authors

Published: 04 Mar 2024, Last Modified: 02 May 2024 · DMLR @ ICLR 2024 · CC BY 4.0
Keywords: Dataset Distillation, Synthetic Dataset Generation, Data Reduction
TL;DR: We propose distilling datasets into latent distributions using a subtask decomposition strategy.
Abstract: What does a neural network learn when training from a task-specific dataset? Synthesizing this knowledge is the central idea behind Dataset Distillation, which recent work has shown can be used to compress a large dataset into a small set of input-label pairs (*prototypes*) that capture essential aspects of the original dataset. In this paper, we make the key observation that existing methods that distill into explicit prototypes are often suboptimal, incurring unexpected storage costs from distilled labels. In response, we propose *Distributional Dataset Distillation* (D3), which encodes the data using minimal sufficient per-class statistics paired with a decoder, allowing for distillation into a compact distributional representation that is more memory-efficient than prototype-based methods. To scale up the process of learning these representations, we propose *Federated distillation*, which decomposes the dataset into subsets, distills them in parallel using sub-task experts, and then re-aggregates them. We thoroughly evaluate our algorithm using a multi-faceted metric, showing that our method achieves state-of-the-art results on TinyImageNet and ImageNet-1K. Specifically, we outperform the prior art by 6.9% on ImageNet-1K under a storage budget equivalent to 2 images per class.
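The sketch below illustrates one way such a distributional representation could be realized: per-class Gaussian statistics in a latent space paired with a shared decoder, from which synthetic labeled batches are sampled. It is a minimal illustration, not the authors' implementation; all names, the decoder architecture, and the latent dimension are assumptions.

```python
# Hypothetical sketch of a distributional distilled dataset:
# store only per-class (mu, logvar) plus one shared decoder, then
# sample synthetic image-label pairs on demand.
import torch
import torch.nn as nn


class Decoder(nn.Module):
    """Maps a latent vector to a 32x32 RGB image (architecture assumed)."""
    def __init__(self, latent_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, 3 * 32 * 32), nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z).view(-1, 3, 32, 32)


class DistributionalDataset(nn.Module):
    """Per-class Gaussian latent statistics plus a shared decoder.

    Storing (mu, logvar) per class instead of explicit image prototypes
    is what makes the representation compact.
    """
    def __init__(self, num_classes=10, latent_dim=64):
        super().__init__()
        self.mu = nn.Parameter(torch.zeros(num_classes, latent_dim))
        self.logvar = nn.Parameter(torch.zeros(num_classes, latent_dim))
        self.decoder = Decoder(latent_dim)

    def sample(self, labels):
        """Draw one synthetic image per requested label (reparameterization)."""
        mu, logvar = self.mu[labels], self.logvar[labels]
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        return self.decoder(z), labels


# Usage: sample a synthetic batch to train a downstream classifier.
d3 = DistributionalDataset(num_classes=10, latent_dim=64)
labels = torch.randint(0, 10, (32,))
images, labels = d3.sample(labels)
print(images.shape)  # torch.Size([32, 3, 32, 32])
```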
Primary Subject Area: Optimal data for standard evaluation framework in the context of changing model landscape
Paper Type: Research paper: up to 8 pages
Participation Mode: Virtual
Confirmation: I have read and agree with the workshop's policy on behalf of myself and my co-authors.
Submission Number: 76