Abstract: Large-scale datasets have enabled impressive progress in deep learning. However, storing large datasets and training neural network models on them has become increasingly expensive. In this paper, we present an effective dataset compression approach based on matrix product states (MPS) and knowledge distillation. An MPS decomposes an image sample into a sequential product of tensors, achieving task-agnostic image compression by preserving the low-rank information of the image. Based on this property, we use multiple MPS to represent the samples of an image dataset. We also design a task-related component based on knowledge distillation to enhance the generality of the compressed dataset. Extensive experiments demonstrate the effectiveness of the proposed approach for image dataset compression; in particular, it obtains better model performance (2.26\(\%\) higher on average) than the best baseline method at the same compression ratio.
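To make the decomposition concrete, below is a minimal sketch of compressing a single image into an MPS (tensor-train) form via sequential truncated SVDs. This is an illustration of the general technique, not the paper's exact pipeline: the function names, the reshape scheme into six physical dimensions, and the bond-dimension cap `max_rank` are all assumptions chosen for the example.

```python
# Minimal MPS (tensor-train) compression of one image via sequential
# truncated SVDs. The reshape scheme and `max_rank` cap are illustrative
# assumptions, not the paper's exact method.
import numpy as np

def mps_decompose(image, dims, max_rank):
    """Factor an image into a list of MPS cores.

    image:    2-D array; image.size must equal prod(dims).
    dims:     physical dimension of each MPS site, e.g. (4,)*6
              for a 64x64 grayscale image.
    max_rank: bond-dimension cap; smaller values compress harder by
              keeping only the leading (low-rank) singular directions.
    """
    tensor = image.reshape(dims)
    cores, rank = [], 1
    for d in dims[:-1]:
        # Split off one physical index, SVD, and truncate the bond rank.
        mat = tensor.reshape(rank * d, -1)
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        new_rank = min(max_rank, len(s))
        cores.append(u[:, :new_rank].reshape(rank, d, new_rank))
        tensor = s[:new_rank, None] * vt[:new_rank]  # carry the remainder
        rank = new_rank
    cores.append(tensor.reshape(rank, dims[-1], 1))
    return cores

def mps_reconstruct(cores):
    """Contract the cores back into the full tensor (lossy if truncated)."""
    out = cores[0]
    for core in cores[1:]:
        out = np.tensordot(out, core, axes=([-1], [0]))
    return out.squeeze()

# Example: compress a 64x64 image into six rank-capped cores.
img = np.random.rand(64, 64)
cores = mps_decompose(img, dims=(4, 4, 4, 4, 4, 4), max_rank=8)
approx = mps_reconstruct(cores).reshape(64, 64)
print("relative error:", np.linalg.norm(img - approx) / np.linalg.norm(img))
```

The stored cores replace the raw pixels; lowering `max_rank` trades reconstruction fidelity for a smaller representation, which is the low-rank property the abstract refers to.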