A four-year-old can outperform ResNet-50: Out-of-distribution robustness may not require large-scale experience

12 Oct 2021, 16:13 (modified: 13 Dec 2021, 09:56) | SVRHM 2021 Poster | Readers: Everyone
Keywords: out-of-distribution, robustness, large-scale training, deep learning, development, children, object recognition
TL;DR: Human out-of-distribution robustness emerges very early in development and, compared to various deep learning models, children's high out-of-distribution robustness requires relatively little data.
Abstract: Recent gains in model robustness towards out-of-distribution images are predominantly achieved through ever-larger datasets. While this approach is very effective in achieving human-level distortion robustness, it raises the question of whether human robustness, too, requires massive amounts of experience. We therefore investigated the developmental trajectory of human object recognition robustness by comparing children aged 4–6, 7–9, 10–12, and 13–15 against adults and against different deep learning models. Assessing how recognition accuracy degrades when images are distorted by salt-and-pepper noise, we find that while overall performance improves with age, even the youngest children in the study showed remarkable robustness and outperformed standard CNNs and self-supervised models on distorted images. In order to compare the robustness of different age groups and models as a function of visual experience, we used a back-of-the-envelope calculation to estimate the number of 'images' that those young children had been exposed to during their lifetime. We find that for humans, more data does not necessarily lead to better out-of-distribution robustness. Compared to various deep learning models, children's high out-of-distribution robustness requires relatively little data. Taken together, this indicates that human out-of-distribution robustness develops very early in life and may not require seeing billions of different images during a lifetime, given the right choice of representation and information processing optimised during evolution.
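
For readers unfamiliar with the distortion named in the abstract, the following is a minimal sketch of salt-and-pepper noise on a grayscale image. It is not the authors' stimulus-generation code: the function name `salt_and_pepper`, the `noise_level` parameter, and the list-of-rows image representation are assumptions made for illustration, and the abstract does not state which noise levels were used.

```python
import random


def salt_and_pepper(image, noise_level, seed=None):
    """Return a noisy copy of a grayscale image.

    image: list of rows, each a list of floats in [0, 1] (assumed format).
    noise_level: probability in [0, 1] that a given pixel is replaced
        by pure black (0.0, "pepper") or pure white (1.0, "salt").
    seed: optional seed so the same noise pattern can be reproduced.
    """
    if not 0.0 <= noise_level <= 1.0:
        raise ValueError("noise_level must be in [0, 1]")
    rng = random.Random(seed)
    noisy = []
    for row in image:
        new_row = []
        for pixel in row:
            if rng.random() < noise_level:
                # Replace the pixel with black or white, equally likely.
                new_row.append(rng.choice((0.0, 1.0)))
            else:
                # Leave the pixel untouched.
                new_row.append(pixel)
        noisy.append(new_row)
    return noisy


# Hypothetical 8x8 uniform mid-gray image, used only to exercise the function.
gray = [[0.5] * 8 for _ in range(8)]
clean = salt_and_pepper(gray, 0.0, seed=0)  # no pixels replaced
full = salt_and_pepper(gray, 1.0, seed=0)   # every pixel replaced
```

Sweeping `noise_level` from 0 towards 1 and measuring recognition accuracy at each step gives the kind of accuracy-versus-distortion curve the abstract refers to when it describes how accuracy degrades.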