Dynamic Mixture Embeddings for Contextual Meta-Reinforcement Learning

14 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: contextual meta-reinforcement learning, meta-reinforcement learning, reinforcement learning, variational autoencoder, representation learning
TL;DR: We introduce a belief-based contextual meta-RL method that learns a hierarchical Gaussian-mixture Variational Autoencoder to represent complex multimodal task structures.
Abstract: Contextual meta-reinforcement learning (meta-RL) relies on latent task embeddings to enable rapid adaptation to an unknown task. However, most methods rely on unimodal priors, which lack the capacity to represent complex multimodal task structure, limiting performance under non-parametric task variation. We introduce Dynamic Mixture Embeddings (DME), a belief-based contextual meta-RL method that learns a hierarchical Gaussian-mixture Variational Autoencoder in which the mixture component parameters are conditioned on a high-level macro latent. This yields an adaptive mixture prior whose means and variances shift as more context is gathered; training is further augmented with virtual tasks drawn from this adaptive prior. DME achieves state-of-the-art performance across the full MetaWorld benchmark suite, which is designed to test adaptation under non-parametric variation.
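The core idea of the abstract, a mixture prior whose component parameters are produced from a high-level macro latent and from which virtual tasks can be sampled, can be sketched minimally in NumPy. Everything below is an illustrative assumption, not the paper's implementation: the dimensions, the linear "hypernetwork" weights `W_logit`/`W_mu`/`W_logvar`, and the function names `adaptive_mixture_prior` and `sample_virtual_tasks` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: K mixture components, macro-latent and task-embedding dims.
K, D_MACRO, D_TASK = 4, 8, 5

# Hypothetical linear maps from the macro latent to per-component
# mixture parameters (logits, means, log-variances).
W_logit = rng.normal(scale=0.1, size=(D_MACRO, K))
W_mu = rng.normal(scale=0.1, size=(D_MACRO, K * D_TASK))
W_logvar = rng.normal(scale=0.1, size=(D_MACRO, K * D_TASK))

def adaptive_mixture_prior(macro):
    """Map a macro latent to the parameters of a K-component Gaussian mixture."""
    logits = macro @ W_logit
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()                      # softmax over components
    means = (macro @ W_mu).reshape(K, D_TASK)
    stds = np.exp(0.5 * (macro @ W_logvar)).reshape(K, D_TASK)
    return weights, means, stds

def sample_virtual_tasks(macro, n):
    """Draw n virtual task embeddings from the adaptive mixture prior."""
    weights, means, stds = adaptive_mixture_prior(macro)
    comps = rng.choice(K, size=n, p=weights)      # pick a component per sample
    return means[comps] + stds[comps] * rng.normal(size=(n, D_TASK))

macro = rng.normal(size=D_MACRO)  # in DME this would be inferred from context
tasks = sample_virtual_tasks(macro, 16)
print(tasks.shape)  # (16, 5)
```

As more context is gathered, re-inferring `macro` shifts the mixture's weights, means, and variances, which is the "adaptive prior" behavior the abstract describes; the sampled `tasks` stand in for the virtual tasks used to augment training.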
Supplementary Material: zip
Primary Area: reinforcement learning
Submission Number: 5134