Abstract: Developing meta-learning algorithms that are unbiased toward a subset of training tasks often requires hand-designed criteria to weight tasks, potentially resulting in sub-optimal solutions. In this paper, we introduce a new principled and fully automated task-weighting algorithm for meta-learning methods. By considering the weights of tasks within the same mini-batch as an action, and the meta-parameter of interest as the system state, we cast the task-weighting meta-learning problem as a trajectory optimisation problem and employ the iterative linear quadratic regulator to determine the optimal action, that is, the weights of tasks. We theoretically show that the proposed algorithm converges to an $\epsilon_{0}$-stationary point, and empirically demonstrate that the proposed approach outperforms common hand-engineered weighting methods on two few-shot learning benchmarks.
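The state/action view described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes the state transition is a single weighted gradient step on the meta-parameter, uses toy quadratic task losses, and omits the iLQR solver entirely. The names `weighted_step`, `rollout`, `targets` and `lr` are hypothetical and chosen for illustration only.

```python
# Toy illustration: meta-parameter as state, task weights as action.
# Assumption (not from the paper): task i has loss 0.5 * ||theta - c_i||^2,
# and the transition is one gradient step on the weighted sum of task losses.

def weighted_step(theta, u, targets, lr=0.1):
    """Return the next state after applying task weights u to the state theta."""
    new_theta = []
    for j, theta_j in enumerate(theta):
        # Gradient of task i w.r.t. coordinate j is (theta_j - c_i[j]).
        grad_j = sum(u_i * (theta_j - c[j]) for u_i, c in zip(u, targets))
        new_theta.append(theta_j - lr * grad_j)
    return new_theta


def rollout(theta0, actions, targets, lr=0.1):
    """Unroll the trajectory of states produced by a sequence of actions."""
    states = [list(theta0)]
    for u in actions:
        states.append(weighted_step(states[-1], u, targets, lr))
    return states


if __name__ == "__main__":
    targets = [[1.0, 0.0], [0.0, 1.0]]   # two toy tasks
    actions = [[0.5, 0.5], [1.0, 0.0]]   # uniform weights, then all weight on task 0
    for state in rollout([0.0, 0.0], actions, targets):
        print(state)
```

In this framing, a trajectory optimiser such as iLQR would search over the sequence `actions` to minimise a cost accumulated along the resulting states; the sketch only shows how a given sequence of weights drives the meta-parameter.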
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission:
We introduce the following major changes to address the main points identified by the reviewers:
- We split the Section Method into two subsections (Section 3.1 Task-weighting as a trajectory optimisation, and Section 3.2 Practical task-weighting method based on trajectory optimisation) for clarity.
- We move the Section Weight Visualisation from the Appendix to the main paper, making it a subsection of the Section Experiments.
- We add a new Section Ablation Studies, after the Section Experiments, to provide an extensive analysis of the effect of several hyper-parameters. In particular, we analyse the influence of the number of iLQR iterations, the length of the trajectory, and the prior of the weighting vector $\mathbf{u}$, and we train the three baselines for longer to make the comparison fairer; in total, this takes more than 500 GPU-hours.
- We move the auxiliary lemmas in Section 4 to the Appendices to increase the readability of the Section Convergence Analysis.
- We add a brief description of the proposed method at the end of the Section Introduction to make the paper easier to follow.
- We further clarify some prior studies in the Section Related Work (highlighted in magenta) and move the Section Related Work to immediately before the Section Discussion and Conclusion.
- We provide an additional shaded-plot visualisation, requested by reviewer SZgw, in Appendix J.
The changes above are all highlighted or annotated to facilitate the next round of reviews.
Supplementary Material: pdf
Assigned Action Editor: Marcello Restelli
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Number: 1145