Heteroskedastic and Imbalanced Deep Learning with Adaptive Regularization

28 Sept 2020, 15:48 (modified: 10 Feb 2022, 11:45). ICLR 2021 Poster. Readers: Everyone
Keywords: deep learning, noise robust learning, imbalanced learning
Abstract: Real-world large-scale datasets are heteroskedastic and imbalanced: labels have varying levels of uncertainty, and label distributions are long-tailed. Heteroskedasticity and imbalance challenge deep learning algorithms because mislabeled, ambiguous, and rare examples are difficult to distinguish from one another. Addressing heteroskedasticity and imbalance simultaneously is under-explored. We propose a data-dependent regularization technique for heteroskedastic datasets that regularizes different regions of the input space differently. Inspired by the theoretical derivation of the optimal regularization strength in a one-dimensional nonparametric classification setting, our approach adaptively regularizes data points in higher-uncertainty, lower-density regions more heavily. We test our method on several benchmark tasks, including a real-world heteroskedastic and imbalanced dataset, WebVision. Our experiments corroborate our theory and demonstrate a significant improvement over other methods in noise-robust deep learning.
One-sentence Summary: We propose a data-dependent regularization technique for learning heteroskedastic and imbalanced datasets.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Code: [kaidic/HAR](https://github.com/kaidic/HAR)
Data: [CIFAR-10](https://paperswithcode.com/dataset/cifar-10), [CIFAR-100](https://paperswithcode.com/dataset/cifar-100), [WebVision](https://paperswithcode.com/dataset/webvision-database)
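The abstract's core idea, regularizing higher-uncertainty, lower-density regions more heavily, can be sketched as a toy NumPy example. This is an illustrative assumption of the general principle, not the paper's actual HAR formulation; the function name and the k-nearest-neighbor density estimate are hypothetical choices made for this sketch.

```python
import numpy as np

def adaptive_reg_strengths(features, uncertainty, k=5, base=1e-3):
    """Toy per-example regularization strengths: heavier regularization
    for examples with higher uncertainty and lower local density.
    Illustrative only; not the formula derived in the paper."""
    # pairwise Euclidean distances between all feature vectors
    dists = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # exclude self-distance
    # crude density estimate: inverse of mean distance to k nearest neighbors
    knn = np.sort(dists, axis=1)[:, :k]
    density = 1.0 / (knn.mean(axis=1) + 1e-12)
    # strength grows with uncertainty, shrinks with density
    return base * uncertainty / density
```

In training, each example's loss term could then carry its own penalty weight (e.g. on a Jacobian or weight norm) given by these strengths, so that rare, noisy points are regularized more than dense, clean ones.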