Controlled maximal variability along with reliable performance in recurrent neural networks

Chiara Mastrogiuseppe; Rubén Moreno-Bote

Published: 25 Sept 2024, Last Modified: 06 Nov 2024

Venue: NeurIPS 2024 poster

License: CC BY-NC-SA 4.0

Keywords: Reinforcement Learning, Computational Neuroscience, Neural Variability, Recurrent Neural Network, Maximum Occupancy Principle, Maximum Entropy Reinforcement Learning

TL;DR: Maximizing cumulative future action entropy allows recurrent neural networks to perform tasks while maximizing variability.

Abstract: Natural behaviors, even stereotyped ones, exhibit variability. Despite its role in exploring and learning, the function and neural basis of this variability are still not well understood. Given the coupling between neural activity and behavior, we ask what type of neural variability does not compromise behavioral performance. While previous studies typically curtail variability to allow for high task performance in neural networks, our approach takes the reverse perspective. We investigate how to generate maximal neural variability while maintaining high network performance. To do so, we extend to neural activity the maximum occupancy principle (MOP) developed for behavior, and refer to this new neural principle as NeuroMOP. NeuroMOP posits that the goal of the nervous system is to maximize future action-state entropy, a reward-free, intrinsic motivation that entails creating all possible activity patterns while avoiding terminal or dangerous ones. We show that this goal can be achieved through a neural network controller that injects currents (actions) into a recurrent neural network with fixed random weights to maximize future cumulative action-state entropy. High activity variability can be induced while adhering to an energy constraint or while avoiding terminal states defined by specific neurons' activities, including in a context-dependent manner. The network solves these tasks by flexibly switching between stochastic and deterministic modes as needed and by projecting noise onto a null space. Based on future maximum entropy production, NeuroMOP contributes to a novel theory of neural variability that reconciles stochastic and deterministic behaviors within a single framework.
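
Illustrative sketch (not the authors' implementation): the setup described in the abstract, a controller injecting currents into a recurrent network with fixed random weights while accumulating discounted action entropy, can be pictured roughly as below. Everything specific here is an assumption made for illustration: the tanh dynamics, the fixed-width Gaussian policy standing in for a learned controller, the discount factor, and the rule that the episode ends when neuron 0 nears saturation. Only the action-entropy term is tracked; no learning is performed.

```python
import numpy as np


def gaussian_entropy(sigma):
    """Differential entropy (nats) of a diagonal Gaussian with std devs `sigma`."""
    return float(np.sum(0.5 * np.log(2.0 * np.pi * np.e * sigma ** 2)))


def rollout_action_entropy(n_neurons=20, n_steps=50, gamma=0.9, sigma=0.5,
                           threshold=0.99, seed=0):
    """Roll out a fixed random RNN driven by stochastic injected currents.

    Returns the discounted cumulative action entropy and the number of
    steps taken before a (hypothetical) terminal state was reached.
    """
    rng = np.random.default_rng(seed)
    # Fixed random recurrent weights; these are never trained (assumed scaling).
    W = rng.normal(0.0, 1.0 / np.sqrt(n_neurons), size=(n_neurons, n_neurons))
    x = np.zeros(n_neurons)              # network activity
    sig = np.full(n_neurons, sigma)      # policy std devs (fixed, not learned)
    total, steps = 0.0, 0
    for t in range(n_steps):
        a = rng.normal(0.0, sig)         # action: currents injected into the RNN
        total += gamma ** t * gaussian_entropy(sig)
        x = np.tanh(W @ x + a)           # assumed tanh rate dynamics
        steps += 1
        # Assumed terminal condition: neuron 0 close to saturation.
        if abs(x[0]) > threshold:
            break
    return total, steps


if __name__ == "__main__":
    total, steps = rollout_action_entropy()
    print(f"discounted action entropy: {total:.3f} over {steps} steps")
```

In this toy version a wider policy yields more entropy per step but makes the assumed terminal state more likely, which cuts the sum short. That trade-off is the intuition behind a controller that switches between stochastic and more deterministic modes.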

Supplementary Material: zip

Primary Area: Neuroscience and cognitive science (neural coding, brain-computer interfaces)

Submission Number: 11170
