Introducing Sample Robustness

28 Sept 2020 (modified: 05 May 2023) · ICLR 2021 Conference Blind Submission · Readers: Everyone
Abstract: Choosing the right data and model for a pre-defined task is a critical competency in machine learning. Investigating which features of a dataset and its underlying distribution a model decodes may shed light on the mysterious "black box" and guide us to a deeper understanding of the ongoing processes. Furthermore, it helps to improve the quality of models that directly depend on data or learn from it through training. In this work, we introduce the dataset-dependent concept of sample robustness, which is based on a point-wise Lipschitz constant of the label map. For a particular sample, it measures how small a perturbation is required to cause a label change relative to the magnitude of the label map. We introduce theory to motivate the concept and to analyse the effects of the training and test data having similar robustness distributions. Afterwards, we conduct various experiments using different datasets and (non-)deterministic models. In some cases, we can boost performance by choosing specifically tailored training (sub)sets and hyperparameters depending on the robustness distribution of the test (sub)sets.
One-sentence Summary: We introduce the concept of sample robustness for measuring how sensitive the elements of a dataset are to label-changing perturbations, followed by a theoretical discussion and an empirical analysis.
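
The precise point-wise Lipschitz-based definition is given in the reviewed PDF; as a rough illustration only, the sketch below approximates sample robustness on a finite dataset as the smallest label-changing input perturbation scaled by the corresponding change in the label map. The function name sample_robustness and the nearest-differently-labelled-pair approximation are assumptions made for this sketch, not the paper's exact construction.

```python
# Hypothetical finite-dataset proxy for "sample robustness" (assumption, not
# the paper's definition): for each sample, take the smallest input-space
# distance to a differently-labelled sample, divided by the corresponding
# change in the label map (an inverse point-wise Lipschitz ratio).
import numpy as np


def sample_robustness(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Return one robustness value per sample in X with scalar labels y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n = len(X)
    robustness = np.full(n, np.inf)
    for i in range(n):
        diff_label = y != y[i]                    # label-changing candidates
        if not np.any(diff_label):
            continue                              # no label change possible
        dx = np.linalg.norm(X[diff_label] - X[i], axis=1)  # perturbation size
        dy = np.abs(y[diff_label] - y[i]).astype(float)    # label-map change
        robustness[i] = np.min(dx / dy)           # smallest ratio over pairs
    return robustness


# Usage: two well-separated clusters yield larger robustness values
# than samples lying close to the opposite class.
X = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [1.1, 1.0]])
y = np.array([0, 0, 1, 1])
print(sample_robustness(X, y))
```

Under this reading, comparing the robustness distributions of training and test (sub)sets reduces to comparing the per-sample values returned above.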
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Reviewed Version (pdf): https://openreview.net/references/pdf?id=Hp8ic2DBBb
