Inverse Reinforcement Learning with Switching Rewards and History Dependency for Characterizing Animal Behaviors
TL;DR: We develop a novel inverse reinforcement learning framework that models the history-dependent, switching reward functions underlying complex animal behaviors
Abstract: Traditional approaches to studying decision-making in neuroscience focus on simplified behavioral tasks where animals perform repetitive, stereotyped actions to receive explicit rewards. While informative, these methods constrain our understanding of decision-making to short timescale behaviors driven by explicit goals. In natural environments, animals exhibit more complex, long-term behaviors driven by intrinsic motivations that are often unobservable. Recent works in time-varying inverse reinforcement learning (IRL) aim to capture shifting motivations in long-term, freely moving behaviors. However, a crucial challenge remains: animals make decisions based on their history, not just their current state. To address this, we introduce SWIRL (SWitching IRL), a novel framework that extends traditional IRL by incorporating time-varying, history-dependent reward functions. SWIRL models long behavioral sequences as transitions between short-term decision-making processes, each governed by a unique reward function. SWIRL incorporates biologically plausible history dependency to capture how past decisions and environmental contexts shape behavior, offering a more accurate description of animal decision-making. We apply SWIRL to simulated and real-world animal behavior datasets and show that it outperforms models lacking history dependency, both quantitatively and qualitatively. This work presents the first IRL model to incorporate history-dependent policies and rewards to advance our understanding of complex, naturalistic decision-making in animals.
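To make the modeling idea concrete, below is a minimal, hypothetical sketch of a generative process with switching, history-dependent rewards. It is not the authors' implementation (see the linked repository for the actual code); the gridworld, the two modes, and the satiation-like switching rule are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D gridworld: states 0..N_STATES-1, actions move left / stay / right.
N_STATES, ACTIONS = 10, np.array([-1, 0, 1])

# Two hypothetical behavioral "modes", each with its own goal state
# (e.g. a water port vs. a shelter) and hence its own reward function.
REWARD_PEAKS = {0: 2, 1: 7}

def reward(mode, state):
    # Mode-specific reward: higher the closer the animal is to that mode's goal.
    return -abs(state - REWARD_PEAKS[mode])

def softmax_policy(mode, state, beta=1.0):
    # Boltzmann policy over one-step look-ahead rewards
    # (a stand-in for a full RL policy under the current mode's reward).
    next_states = np.clip(state + ACTIONS, 0, N_STATES - 1)
    prefs = beta * np.array([reward(mode, s) for s in next_states])
    probs = np.exp(prefs - prefs.max())
    return probs / probs.sum(), next_states

def switch_prob(mode, state, history):
    # History-dependent switching: time recently spent at the current goal
    # raises the probability of switching modes (a satiation-like effect).
    time_at_goal = sum(s == REWARD_PEAKS[mode] for s in history[-5:])
    return min(0.05 + 0.2 * time_at_goal, 0.95)

def simulate(T=50):
    mode, state, history = 0, N_STATES // 2, []
    trajectory = []
    for _ in range(T):
        if rng.random() < switch_prob(mode, state, history):
            mode = 1 - mode
        probs, next_states = softmax_policy(mode, state)
        state = rng.choice(next_states, p=probs)
        history.append(state)
        trajectory.append((mode, state))
    return trajectory

if __name__ == "__main__":
    for mode, state in simulate():
        print(f"mode={mode} state={state}")
```

An IRL method in this setting observes only the state sequence and must recover both the per-mode reward functions and the history-dependent switching structure.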
Lay Summary: In the real world, animals don't act based on short-term goals alone. They switch between different objectives—like finding water, resting, or exploring—and use their past experiences to inform future decisions. However, most inverse reinforcement learning (IRL) methods, which aim to recover the underlying reward function (goal) from observed behavior demonstrations, assume behavior is driven by a single, static goal and ignore the influence of past actions. This limits their ability to capture the dynamic, adaptive nature of real animal behavior.
We introduce SWIRL, a new IRL framework that captures both goal switching and history dependency. SWIRL recovers switching reward functions that reflect changing motivations and models how recent behavioral history influences decisions—offering a richer, more biologically plausible model of behavior. We evaluate SWIRL on both synthetic and real-world datasets of animal behavior and find that it recovers the underlying goals and behavioral segments more accurately than prior IRL approaches, particularly those that ignore history.
By revealing interpretable, history-aware structure in long-term naturalistic behavior, SWIRL provides a valuable tool for neuroscience research. It also offers insights for broader machine learning applications involving switching, context-dependent decision-making.
Application-Driven Machine Learning: This submission is on Application-Driven Machine Learning.
Link To Code: https://github.com/BRAINML-GT/SWIRL
Primary Area: Applications->Neuroscience, Cognitive Science
Keywords: neuroscience, decision-making, inverse reinforcement learning, naturalistic behavior
Submission Number: 7175