Keywords: reinforcement learning, evolution, open-endedness, fitness, multi-objective optimization, resource-rationality, bounded-rationality, minimal criteria, novelty search
TL;DR: An open-ended evolutionary framework is used to address the three "dogmas" of RL and to suggest new avenues for research.
Abstract: Three core assumptions in reinforcement learning (RL)—regarding the definition of agency, the objective of learning, and the scope of applicability of the reward hypothesis—have previously been highlighted as key targets for conceptual revision, with significant implications for theory and application. Here, we provide a framework, largely inspired by open-ended evolutionary theory, to view these three "dogmas" in a new light. We address each of the original three assumptions and illuminate a number of adjacent concerns that were raised in connection with them. To render our arguments relevant to RL as a model of biological learning, we first establish that evolutionary dynamics can plausibly operate within living brains over the course of an individual's lifetime, and are not confined to cross-generational processes alone. We then tackle the second dogma, drawing on a rich evolutionary understanding of adaptation to enrich the adaptation-rather-than-search conception of learning advocated previously. Next, we address the third dogma regarding the limits of the reward hypothesis, using insights from the role of fitness in evolutionary theory to shed light on the multi-objective versus scalar reward debate. Finally, after discussing the implications of these insights for the question of exploration in RL, we turn to the first, and arguably most fundamental, of the three issues: the lack of a formal account of agency. We argue that, unlike the other two problems, the evolutionary paradigm alone is not sufficient to resolve the problem of defining agency, but that it nevertheless points us in the right direction. We advocate for ideas from theories of the origins of life as the ultimate source of agency and of the evolutionary dynamics championed earlier. We argue that lessons from origin-of-life research, namely concerning the thermodynamics of sustenance and replication, prove relevant to a resource-constrained account of reinforcement learning in brains.
Submission Number: 38