Abstract: A central capability of intelligent agents is the ability to quickly learn new tasks or adapt to new scenarios by drawing upon prior related experience. Gradient-based meta-learning has recently emerged as an effective approach for few-shot learning and fast adaptation. However, a key challenge in scaling these approaches to more sophisticated learning processes is the requirement of differentiating through the inner optimization. Using connections to the implicit gradient theorem, we develop a new meta-learning method that depends only on the solution to the inner-level optimization and not on the path taken by the optimizer, effectively decoupling the meta-gradient computation from the choice of inner-loop optimizer. As a result, this approach can handle large numbers of gradient steps without numerical instability or memory overflow, and it also accommodates higher-order inner optimization algorithms such as quasi-Newton methods.
Code Link: https://sites.google.com/view/imaml
CMT Num: 60