The Foes of Neural Network’s Data Efficiency Among Unnecessary Input Dimensions

28 Sept 2020 (modified: 05 May 2023)
ICLR 2021 Conference Withdrawn Submission
Readers: Everyone
Keywords: Data Efficiency, Small Sample Size, Data Dimensionality, Image Background
Abstract: Input dimensions are unnecessary for a given task when the target function can be expressed without them. An object's background in image recognition and redundant sentences in text classification are examples of unnecessary dimensions that are often present in datasets. Deep neural networks achieve remarkable generalization performance despite the presence of unnecessary dimensions, but it is unclear whether, and how, these dimensions negatively affect them. In this paper, we investigate the impact of unnecessary input dimensions on one of the central issues of machine learning: the number of training examples needed to achieve high generalization performance, which we refer to as the network's data efficiency. In a series of analyses with multi-layer perceptrons and deep convolutional neural networks, we show that the network's data efficiency depends on whether the unnecessary dimensions are task-unrelated or task-related (unnecessary due to redundancy). Namely, we demonstrate that increasing the number of task-unrelated dimensions leads to an incorrect inductive bias and, as a result, degrades data efficiency, while increasing the number of task-related dimensions helps to alleviate the negative impact of the task-unrelated dimensions. These results highlight the need for mechanisms that remove task-unrelated dimensions, such as crops or foveation for image classification, to enable data efficiency gains.
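The distinction between the two kinds of unnecessary dimensions can be made concrete with a toy construction. The sketch below is an illustration only: the abstract does not describe how the paper builds its datasets, so the helper names `add_unnecessary_dims` and `toy_target`, the use of Gaussian noise for task-unrelated dimensions, and the use of repeated features for task-related dimensions are all assumptions.

```python
import random


def add_unnecessary_dims(x, n_unrelated=0, n_related=0, rng=None):
    """Append unnecessary dimensions to one input vector.

    Hypothetical helper, not the paper's procedure.
    """
    if rng is None:
        rng = random.Random(0)
    # Task-related (redundant) dimensions: copies of the original features,
    # repeated cyclically. They carry no new information about the label.
    related = [x[i % len(x)] for i in range(n_related)]
    # Task-unrelated dimensions: noise drawn independently of the input,
    # and therefore independent of the label (assumed Gaussian here).
    unrelated = [rng.gauss(0.0, 1.0) for _ in range(n_unrelated)]
    return list(x) + related + unrelated


def toy_target(x_full, n_original):
    """Toy label that depends only on the first n_original dimensions."""
    return 1 if sum(x_full[:n_original]) > 0 else 0


x = [1.0, -2.0]
x_aug = add_unnecessary_dims(x, n_unrelated=3, n_related=4, rng=random.Random(1))
```

In this construction the label computed from `x_aug` is the same as the label computed from `x`, which is what makes the appended dimensions unnecessary in the abstract's sense: the target function can be written without them.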
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
One-sentence Summary: Increasing the number of task-unrelated dimensions degrades the network's data efficiency, while increasing the number of task-related dimensions helps to alleviate the negative impact of the task-unrelated dimensions.
Reviewed Version (pdf): https://openreview.net/references/pdf?id=D_kmb2AMT
