A comparison of dataset distillation and active learning in text classification

Published: 01 Feb 2023, Last Modified: 13 Feb 2023, Submitted to ICLR 2023
Keywords: knowledge distillation, dataset distillation, active learning, text classification
Abstract: Deep learning has achieved great success over the past few years in areas ranging from computer vision to natural language processing. However, the enormous amount of data required by deep learning has long been a thorny problem when learning the underlying distribution and tackling various human tasks. To alleviate this problem, knowledge distillation was proposed to simplify models, and dataset distillation was later introduced as a new way to reduce dataset sizes: it aims to synthesize a small number of samples that retain the information of a very large dataset. Meanwhile, active learning is another effective method for reducing dataset sizes, selecting only the most informative samples from the original dataset for labeling. In this paper, we explore the differences between the principles of dataset distillation and active learning, and evaluate the two algorithms on an NLP classification dataset, the Stanford Sentiment Treebank. In our experiments, distilled data amounting to 0.1% of the original text data achieves approximately 88% accuracy, while the actively selected data reaches 52% of the performance of the original data.
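To make the selection principle of active learning concrete, below is a minimal Python sketch of uncertainty-based (least-confidence) sample selection, one common acquisition criterion; the function name, the two-class pool, and the labeling budget are illustrative assumptions rather than the specific algorithm evaluated in the paper.

import numpy as np

def select_least_confident(probs, budget):
    # probs: array of shape (n_unlabeled, n_classes) holding the current
    # model's predicted class probabilities over the unlabeled pool.
    confidence = probs.max(axis=1)          # confidence of the top prediction
    return np.argsort(confidence)[:budget]  # indices of the least-confident samples

# Usage sketch: pick 64 texts to label out of a pool of 10,000
# two-class (e.g., sentiment polarity) predictions.
rng = np.random.default_rng(0)
pool_probs = rng.dirichlet(np.ones(2), size=10_000)
to_label = select_least_confident(pool_probs, budget=64)

In a full active learning loop, the selected samples are labeled, added to the training set, the model is retrained, and the selection step is repeated until the labeling budget is exhausted.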
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Applications (eg, speech processing, computer vision, NLP)