Learning to Recover from Failures using Memory

Tao Chen; Pulkit Agrawal

28 Sept 2020 (modified: 05 May 2023)

ICLR 2021 Conference Withdrawn Submission

Readers: Everyone

Keywords: memory, meta learning, learn from failures

Abstract: Learning from past mistakes is a quintessential aspect of intelligence. In sequential decision-making, existing meta-learning methods that learn a learning algorithm utilize experience from only a few previous episodes to adapt their policy to new environments and tasks. Such methods must learn to correct their mistakes from highly correlated sequences of states and actions generated by the same policy's consecutive roll-outs during training. Learning from correlated data is known to be problematic and can significantly impact the quality of the learned correction mechanism. We show that this problem can be mitigated by augmenting current systems with an external memory bank that stores a larger and more diverse set of past experiences. Detailed experiments demonstrate that our method outperforms existing meta-learning algorithms on a suite of challenging tasks from raw visual observations.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

One-sentence Summary: We propose a meta-learner that uses a memory bank and attends over early lifetime experience, especially failures, to improve learning efficiency and test performance.

Reviewed Version (pdf): https://openreview.net/references/pdf?id=NdnhVDMlKz
