Meta-Learning with Task-Environment Interaction

21 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Meta-learning, Task-Environment Interaction, Reward Calculation, Difficulty Assessment, Few-shot Classification
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: The goal of meta-learning is to learn a universal model from diverse meta-training tasks that can rapidly adapt to new tasks with minimal training. Mainstream meta-learning algorithms randomly sample meta-training tasks from a task pool, and the meta-model treats the sampled tasks equally, without discrimination, training on them as a whole. However, under the constraints of limited computational power and training time, harmful tasks sampled from an imbalanced distribution can significantly degrade the optimization of the meta-model. This paper therefore introduces Task-Environment Interaction Meta-Learning (TIML), a form of meta-learning distinct from reinforcement learning with data preprocessing. TIML adds a task-environment interaction mechanism that assesses the interaction between the meta-learning model and the currently sampled task environment, and trains differently according to factors such as task difficulty, reward, and harmfulness, replacing the current practice of handling multiple tasks uniformly. In this way, the generalization and convergence of the meta-parameters on unseen tasks can be improved rapidly. Experimental results demonstrate that TIML improves model performance while keeping the same training time complexity; it converges faster, is more stable, and can be flexibly combined with other models, demonstrating its simplicity and universality.
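The abstract's core idea, weighting sampled tasks by difficulty and masking harmful ones instead of averaging uniformly, can be sketched minimally as follows. This is a hypothetical illustration, not the paper's actual mechanism: the function name `task_weights`, the `harm_cap` rule (drop tasks whose loss exceeds a multiple of the batch median), and the softmax-over-loss difficulty score are all assumptions for the sake of the example.

```python
import math

def task_weights(losses, harm_cap=5.0, temperature=1.0):
    """Assign a training weight to each sampled task from its support-set loss.

    Hypothetical sketch of a task-environment interaction step: tasks whose
    loss exceeds harm_cap times the batch median are treated as harmful and
    masked out; the remaining tasks are weighted by a softmax over their
    losses (higher loss = harder task = larger weight), replacing the usual
    uniform average over the task batch.
    """
    median = sorted(losses)[len(losses) // 2]
    mask = [loss <= harm_cap * median for loss in losses]
    kept = [loss for loss, keep in zip(losses, mask) if keep]
    peak = max(kept)  # subtract max for numerical stability of the softmax
    scores = [math.exp((loss - peak) / temperature) if keep else 0.0
              for loss, keep in zip(losses, mask)]
    total = sum(scores)
    return [s / total for s in scores]

# A meta-update would then aggregate per-task gradients with these weights
# instead of a plain mean, e.g. sum(w_i * g_i for each sampled task i).
```

For example, `task_weights([1.0, 2.0, 100.0])` zeroes out the outlier task and gives the harder of the two remaining tasks the larger weight.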
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 3257