Brain Bandit: A Biologically Grounded Neural Network for Efficient Control of Exploration

Chen Jiang; Jiahui An; Yating Liu; Ni Ji

Brain Bandit: A Biologically Grounded Neural Network for Efficient Control of Exploration

Chen Jiang, Jiahui An, Yating Liu, Ni Ji

Published: 22 Jan 2025, Last Modified: 16 May 2025ICLR 2025 OralEveryoneRevisionsBibTeXCC BY 4.0

Keywords: explore-exploit, stochastic Hopfield network, Thompson sampling, decision under uncertainty, brain-inspired algorithm, reinforcement learning

TL;DR: We demonstrate that a brain-inspired stochastic Hopfield network can achieve efficient, human-like, uncertainty-aware exploration in bandit and MDP tasks.

Abstract: How to balance between exploration and exploitation in an uncertain environment is a central challenge in reinforcement learning. In contrast, humans and animals have demonstrated superior exploration efficiency in novel environments. To understand how the brain’s neural network controls exploration under uncertainty, we analyzed the dynamical systems model of a biological neural network that controls explore-exploit decisions during foraging. Mathematically, this model (named the Brain Bandit Net, or BBN) is a special type of stochastic continuous Hopfield network. We show through theory and simulation that BBN can perform posterior sampling of action values with a tunable bias towards or against uncertain options. We then demonstrate that, in multi-armed bandit (MAB) tasks, BBN can generate probabilistic choice behavior with a flexible uncertainty bias resembling human and animal choice patterns. In addition to its high efficiency in MAB tasks, BBN can also be embedded with reinforcement learning algorithms to accelerate learning in MDP tasks. Altogether, our findings reveal the theoretical foundation for efficient exploration in biological neural networks and propose a general, brain-inspired algorithm for enhancing exploration in RL.

Primary Area: applications to neuroscience & cognitive science

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 12774

Loading