Improving Generalization for Small Datasets with Data-Aware Dynamic Reinitialization

23 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: general machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Generalization, Weight reinitialization, Iterative training, Overfitting, Small datasets
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: A novel iterative learning paradigm with data-aware dynamic masking removes redundant connections, increases DNNs' capacity for learning, and improves generalization on small datasets.
Abstract: The efficacy of deep learning methods depends on access to large volumes of data (labeled or unlabeled). However, such data is frequently scarce in practical domains such as medical applications. This raises a formidable challenge: How can we effectively train a deep neural network on a relatively small dataset while improving generalization? Recent work has explored evolutionary or iterative training paradigms that reinitialize a subset of the parameters to improve generalization on small datasets. While effective, these methods select the subset of parameters at random and maintain a fixed mask throughout iterative training, which can be suboptimal. Motivated by the process of neurogenesis in the brain, we propose a novel iterative training framework, Selective Knowledge Evolution (SKE), that employs a data-aware dynamic masking scheme to eliminate redundant connections by estimating their significance, thereby increasing the model's capacity for further learning via random weight reinitialization. Our experimental results demonstrate that the approach outperforms existing methods in accuracy and robustness, highlighting its potential for real-world applications where collecting data is challenging.
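The core loop the abstract describes — score each connection's significance on the data, mask out the least significant ones, and reinitialize them at random before the next round of training — can be sketched as below. This is a minimal illustration under stated assumptions, not the paper's implementation: the significance estimate here is a SNIP-style |weight × accumulated gradient| saliency (the paper's exact data-aware estimator is not given on this page), and the function name `ske_reinitialize`, the `drop_fraction` parameter, and the Kaiming reinitialization are all hypothetical choices.

```python
import torch
import torch.nn as nn


def ske_reinitialize(model, data_loader, loss_fn, drop_fraction=0.2, device="cpu"):
    """One data-aware dynamic reinitialization step (sketch).

    Estimates per-weight significance with a gradient-times-weight score
    (a common saliency proxy; the paper's estimator may differ), then
    reinitializes the least significant fraction of weights per layer.
    """
    model.to(device).train()
    model.zero_grad()

    # Accumulate gradients over one pass of the (small) dataset,
    # so the significance score reflects the data.
    for inputs, targets in data_loader:
        inputs, targets = inputs.to(device), targets.to(device)
        loss = loss_fn(model(inputs), targets)
        loss.backward()

    with torch.no_grad():
        for module in model.modules():
            if isinstance(module, (nn.Linear, nn.Conv2d)):
                w = module.weight
                if w.grad is None:
                    continue
                # Data-aware significance: |weight * accumulated gradient|.
                score = (w * w.grad).abs().flatten()
                k = int(drop_fraction * score.numel())
                if k == 0:
                    continue
                # Select the k least significant connections in this layer.
                _, idx = torch.topk(score, k, largest=False)
                mask = torch.zeros_like(score, dtype=torch.bool)
                mask[idx] = True
                mask = mask.view_as(w)
                # Reinitialize the redundant connections at random,
                # freeing capacity for further learning.
                fresh = torch.empty_like(w)
                nn.init.kaiming_uniform_(fresh, a=5 ** 0.5)
                w[mask] = fresh[mask]
    model.zero_grad()
```

In the iterative paradigm, a step like this would run between training generations: surviving weights carry over their learned values, the reinitialized ones are retrained, and the mask is recomputed from the data each round rather than being fixed once up front.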
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: pdf
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 8198