Keywords: Meta-Learning, Few-shot learning
TL;DR: An analysis of different task sampling schemes to understand the effect of task diversity on meta-learning.
Abstract: Few-shot learning aims to learn representations that can tackle novel tasks given a small number of examples. Recent studies show that the task distribution plays a vital role in the performance of the model. Conventional wisdom holds that task diversity should improve the performance of meta-learning. In this work, we find evidence to the contrary; we study different task distributions across a myriad of models and datasets to evaluate the effect of task diversity on meta-learning algorithms. For these experiments, we train on two datasets, Omniglot and miniImageNet, with three broad classes of meta-learning models: metric-based (e.g., ProtoNet, Matching Networks), optimization-based (e.g., MAML, Reptile, and MetaOptNet), and Bayesian (e.g., CNAPs). Our experiments demonstrate that the effect of task diversity on all these algorithms follows a similar trend, and task diversity does not seem to offer any benefit to the learning of the model. Furthermore, we also demonstrate that even a handful of tasks, repeated over multiple batches, is sufficient to achieve performance similar to uniform sampling, which calls into question the need for additional tasks to create better models.
Contribution Process Agreement: Yes
Author Revision Details: **(1) Area Chair TWin: considering more complex few-shot learning scenarios like cross-domain setups, as it would be interesting to examine whether the conclusions are different there.** * We extend our experiments to more complex datasets such as Meta-Dataset and *tiered*ImageNet and confirm that our findings remain consistent. Furthermore, we also experiment with few-shot regression datasets to support the same conclusion. We hope to push our paper to arXiv very soon. --- **(2) Reviewer A95p: Low novelty and the result findings are not surprising. Code is not provided and some important details are missing.** * Although we agree with the reviewer that there is little technical novelty in our work, it sheds light on an important observation: diversity does not improve performance. Our goal in this work was to study the effect of diversity in meta-learning, not to propose a sampler that performs best. What we bring to the table is the fact that increased task diversity harms the model (which goes against conventional wisdom in meta-learning), and that we can achieve performance similar to uniform sampling with only a handful of tasks. The first implies that training on diverse datasets does not guarantee any boost in performance, and that the creation of complicated datasets with very diverse tasks might not be the path to better models. The second is a finding very useful in practical scenarios: when data is limited and samplers such as NDB can match the performance of uniform sampling, it calls into question how efficiently uniform sampling uses data; although the model has seen more data, its performance is no better. For these two reasons, we believe the analysis performed in this work brings practical results useful to the general audience. Furthermore, we have shared our code now, and not earlier, to uphold the anonymous, double-blind evaluation process.
--- **(3) Reviewer hXQo:** * **Novelty & Significance:** Setlur et al. (2021) study a specific sampler by limiting the pool of tasks. The goal of their paper is to propose a sampler that is robust even when working with a limited pool of tasks, and it empirically shows that limiting task diversity does not have adverse effects. In that respect, we believe our paper is fundamentally different from theirs. We study different levels of task diversity and attempt to disprove the conventional wisdom that “task diversity is good for learning” by considering the other end of the spectrum (increased task diversity) using samplers such as OHTM, sDPP, and dDPP, something Setlur et al. (2021) do not explore. For this reason, we believe our paper brings some important findings. * **Potential Impact:** We refer the reviewer to our comment to reviewer A95p. * **Technical Quality:** We have performed a more extensive study on the datasets mentioned by the reviewer. * **Clarity and Reproducibility:** The goal of our paper was to study task diversity, and DPPs are a common tool to consider when one wishes to improve diversity. If the reviewer knows of any simpler alternatives, please let us know, and we will try to add them to the experiments in our repo. As for reproducibility, we have shared our code now, and not earlier, to uphold the anonymous, double-blind evaluation process. --- **(4) Reviewer LPJa:** * We hope we have addressed some of your concerns about novelty in our comment to reviewer hXQo, and believe we have been able to differentiate our work from theirs in your eyes. * For your concerns about reproducing the samplers from [18,10], we would like to point out that: * Our OHTM sampler is an adaptation of the cited sampler in which we sample the hardest task and not just the hardest class, but the principle remains the same.
* Our low-diversity samplers such as NDT are an extreme case of the cited sampler, with the support-set pool fixed to the number of classes per task. We were not able to reproduce the samplers exactly as described, but we hope our adaptations are acceptable. * We have shared our GitHub repo in the paper to help with the issues regarding reproducibility. * Thank you very much for your suggestions to improve our paper. We made sure to run on more challenging datasets such as Meta-Dataset and *tiered*ImageNet, as well as few-shot regression datasets such as Sinusoid, Sinusoid & Line, and Harmonic, to confirm that our findings hold across different datasets and settings. --- Again, thank you very much for your valuable feedback and time!
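To make the sampling regimes discussed in the responses above concrete, here is a minimal sketch of the three kinds of samplers: uniform task sampling, an OHTM-style hardest-task miner, and an NDT-style no-diversity sampler that repeats a single fixed task. This is purely illustrative; the interface (class pools as lists of class IDs, a loss-scored task buffer, the `hard_fraction` parameter) is our own assumption, not the implementation from the paper or its repo:

```python
import random


def uniform_task_sampler(class_pool, n_way, n_tasks):
    """Uniform sampling: each task is an independent random n-way class subset."""
    return [random.sample(class_pool, n_way) for _ in range(n_tasks)]


def ndt_sampler(class_pool, n_way, n_tasks, seed=0):
    """NDT-style no-diversity sketch: fix one n-way task and repeat it every batch."""
    task = random.Random(seed).sample(class_pool, n_way)
    return [list(task) for _ in range(n_tasks)]


class OHTMSampler:
    """OHTM-style sketch: mine whole hard tasks (not just hard classes).

    A buffer remembers previously seen tasks scored by their observed loss;
    each batch mixes the hardest remembered tasks with fresh uniform samples.
    """

    def __init__(self, class_pool, n_way, buffer_size=100):
        self.class_pool = class_pool
        self.n_way = n_way
        self.buffer_size = buffer_size
        self.buffer = []  # (loss, task) pairs, kept sorted hardest-first

    def record(self, task, loss):
        # Remember the task with its loss; keep only the hardest buffer_size.
        self.buffer.append((loss, task))
        self.buffer.sort(key=lambda pair: -pair[0])
        del self.buffer[self.buffer_size:]

    def sample(self, n_tasks, hard_fraction=0.5):
        # Take the hardest remembered tasks, top up with fresh uniform ones.
        n_hard = min(int(n_tasks * hard_fraction), len(self.buffer))
        hard = [task for _, task in self.buffer[:n_hard]]
        fresh = uniform_task_sampler(self.class_pool, self.n_way,
                                     n_tasks - n_hard)
        return hard + fresh
```

Under this sketch, the diversity spectrum the paper studies corresponds to moving from `ndt_sampler` (one task repeated) through `uniform_task_sampler` to the hard-task-biased `OHTMSampler`.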
Poster Session Selection: Poster session #2 (15:00 UTC)