Keywords: Few-shot learning, Meta-Dataset, Benchmarks, Evaluation
TL;DR: We propose (i) a general and computationally efficient algorithm to extract difficult few-shot classification tasks from large-scale vision datasets, and (ii) a new test benchmark of these difficult tasks to stress test few-shot classifiers.
Abstract: Few-shot classification is the ability to adapt to any new classification task from only a few training examples. The performance of current top-performing few-shot classifiers varies widely across different tasks where they often fail on a subset of `difficult' tasks. This phenomenon has real-world consequences for deployed few-shot systems where safety and reliability are paramount, yet little has been done to understand these failure cases. In this paper, we study these difficult tasks to gain a more nuanced understanding of the limitations of current methods. To this end, we develop a general and computationally efficient algorithm called FastDiffSel to extract difficult tasks from any large-scale vision dataset. Notably, our algorithm can extract tasks at least 20x faster than existing methods enabling its use on large-scale datasets. We use FastDiffSel to extract difficult tasks from Meta-Datasset, a widely-used few-shot classification benchmark, and other challenging large-scale vision datasets including ORBIT, CURE-OR and ObjectNet. These tasks are curated into Hard-MD++, a new few-shot testing benchmark to promote the development of methods that are robust to even the most difficult tasks. We use Hard-MD++ to stress-test an extensive suite of few-shot classification methods and show that state-of-the-art approaches fail catastrophically on difficult tasks. We believe that our extraction algorithm FastDiffSel and Hard-MD++ will aid researchers in further understanding failure modes of few-shot classification models.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Deep Learning and representational learning