Supplementary Material: zip
Keywords: agent, open-endedness, reinforcement learning, continual learning, lifelong learning
Abstract: This research project explores the hypothesis that, given a bounded number of steps in an environment, agents that most efficiently optimize their model of the environment are more likely to induce emergent intelligent behavior in a reward-free scenario. We refer to this as the optimal explorer hypothesis. The project aims to formalize and analyze this hypothesis, investigating its theoretical implications and its connections to related areas such as open-ended learning and active inference. Building on this foundation, we will develop a practical implementation of an approximate "optimal explorer" agent by formulating exploration as a combinatorial optimization problem and leveraging established methods from that field. Finally, we will conduct extensive experiments to evaluate whether the proposed agent induces emergent behaviors in diverse and challenging environments.
Submission Number: 8