Not All Tasks are Equal - Task Attended Meta-learning for Few-shot Learning

TMLR Paper 615 Authors

19 Nov 2022 (modified: 28 Feb 2023) · Rejected by TMLR
Abstract: Meta-learning (ML) has emerged as a promising direction for learning models under constrained resource settings such as few-shot learning. Popular approaches for ML either learn a generalizable initial model or a generic parametric optimizer through batch episodic training. In this work, we study the importance of the tasks within a batch for ML. We hypothesize that the common assumption in batch episodic training, that each task in a batch contributes equally to learning an optimal meta-model, need not be true. We propose to weight the tasks in a batch according to their "importance" in improving the meta-model's learning. To this end, we introduce a training curriculum called task attended meta-training to learn a meta-model from the weighted tasks in a batch. The task attention module is a standalone unit and can be integrated with any batch episodic training regimen. Comparisons of task-attended ML models with their non-task-attended counterparts on complex datasets, performance improvements of the proposed curriculum over state-of-the-art task scheduling algorithms on noisy datasets, and evaluations in a cross-domain few-shot learning setup validate its effectiveness.
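The core mechanism described in the abstract, weighting the per-task query losses in a batch before the meta-update, can be illustrated with a minimal sketch. The PyTorch snippet below is an illustration under assumed details, not the authors' implementation: the `TaskAttention` scoring network, its input (the detached query losses), the single inner-loop adaptation step, and the learning rates are all hypothetical simplifications of the paper's task attention module.

```python
# Minimal sketch (not the authors' code) of task-weighted batch episodic meta-training.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TaskAttention(nn.Module):
    """Hypothetical module: maps a per-task statistic (here, the query loss)
    to a normalized weight for each task in the batch."""
    def __init__(self, in_dim=1, hidden=16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, task_stats):                  # task_stats: (num_tasks, in_dim)
        scores = self.net(task_stats).squeeze(-1)   # (num_tasks,)
        return F.softmax(scores, dim=0)             # weights sum to 1 over the batch

def inner_adapt(model, x_s, y_s, inner_lr=0.01):
    """One MAML-style inner step on the support set; returns fast weights."""
    loss = F.cross_entropy(model(x_s), y_s)
    names, params = zip(*model.named_parameters())
    grads = torch.autograd.grad(loss, params, create_graph=True)
    return {n: p - inner_lr * g for n, p, g in zip(names, params, grads)}

def meta_train_step(model, attention, meta_opt, task_batch):
    """task_batch: list of (x_support, y_support, x_query, y_query) tensors."""
    query_losses = []
    for x_s, y_s, x_q, y_q in task_batch:
        fast_weights = inner_adapt(model, x_s, y_s)
        logits = torch.func.functional_call(model, fast_weights, (x_q,))
        query_losses.append(F.cross_entropy(logits, y_q))
    query_losses = torch.stack(query_losses)                  # (num_tasks,)
    weights = attention(query_losses.detach().unsqueeze(-1))  # (num_tasks,)
    meta_loss = (weights * query_losses).sum()                 # weighted, not uniformly averaged
    meta_opt.zero_grad()
    meta_loss.backward()
    meta_opt.step()
    return meta_loss.item()
```

In this sketch `meta_opt` is assumed to optimize both the meta-model and the attention parameters jointly; the paper's actual curriculum may learn the attention weights differently.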
Submission Length: Long submission (more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=2R0fQfnijt&noteId=EaSrSeY9S1
Changes Since Last Submission: Dear Editor,

We submitted our work entitled "Not All Tasks are Equal - Task Attended Meta-learning for Few-shot Learning" to TMLR and received a rejection with encouragement to resubmit. We have addressed the reviewers' concerns in the current draft and colored the changes in magenta for easy identification. Please find below our responses to the primary concerns of the most dissenting reviewer, as relayed by the editor. Due to space limitations, the detailed answers are given in the first few pages of the draft.

**(1) Performance concern:** Our baselines (models without attention), denoted by * in Table 2, were lower than those reported in the literature because of differences between the reported experimental setups and ours. The decreased accuracy of the baselines also propagates to their task-attended counterparts. We now perform an additional set of experiments using the reported setups (denoted by #) and show the merit of the proposed approach on those setups as well (Table 2, main paper).

**(2) Motivation concern:** We added the motivational difference between our approach and global task sampling approaches in Section 2 (magenta) and empirically show that our weighting mechanism imparts better generalizability to the meta-model than global weighting of the tasks. This is shown in Tables 3, 4, and 5 (main paper) for various algorithms (MAML, ANIL, MetaSGD), under different few-shot settings (5-way 1-shot and 5-way 5-shot), datasets (miniImagenet, noisy miniImagenet), and dataset properties (in-distribution, noisy distribution, and cross-domain). We therefore believe that the approach and the results are relevant to the meta-learning community.

**(3) Efficiency concern:** We agree that the training time for all scheduling/sampling approaches is expected to be higher than that of their non-scheduling/sampling counterparts. As training is typically performed offline, the increased computational overhead is expected to be acceptable. Further, ours as well as the other scheduling approaches perform vanilla fine-tuning during meta-testing (i.e., no task attention, neural scheduling, or conflict-resolving mechanism is employed during meta-testing), resulting in comparable test times (15-20 seconds on 300 tasks for MAML 5-way 1- and 5-shot setups). We add this detailed discussion in Section 7.2.4 of the supplementary material.

We hope that the revised manuscript meets the expectations of the TMLR journal.

Thanks,
Assigned Action Editor: ~Yingnian_Wu1
Submission Number: 615