Bayesian neural networks with Dirichlet process priors for reinforcement learning

TMLR Paper8988 Authors

17 May 2026 (modified: 29 May 2026), Under review for TMLR, CC BY 4.0
Abstract: We introduce a new class of Bayesian Neural Networks (BNNs) that capture (Bayesian) uncertainty in predictions by exploiting uncertainty about the underlying data-generating distribution of the training data: this distribution is treated as a random variable with a Bayesian nonparametric prior on the space of distribution functions, namely a Dirichlet Process (DP). We show that these DP-based BNNs provide a generalized Bayesian framework for designing randomized value-function-based deep reinforcement learning (RL) algorithms. Crucially, RL with DP-BNNs makes it possible to introduce a "prior" mechanism in a principled Bayesian manner. Such a "prior" mechanism has previously been shown to be decisive (Osband et al., 2018) for the success of randomized value-function-based deep RL algorithms, but a principled Bayesian procedure for it had remained unknown.
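As a rough illustration of the idea of placing a DP prior on the data-generating distribution, the sketch below draws one approximate sample from a DP posterior as a weighted dataset. It is not taken from the paper. It assumes the standard conjugacy result that, given n observations, a DP(alpha, F0) prior yields a DP posterior with concentration alpha + n, and it uses a common finite truncation in which the prior mass alpha is split equally over a few pseudo-samples drawn from F0. The names `dp_posterior_weights`, `sample_weighted_dataset`, `base_sampler`, and `n_pseudo` are hypothetical.

```python
import random


def dp_posterior_weights(n_data, n_pseudo, alpha, rng):
    """Draw one weight vector for a truncated DP posterior sample.

    Each observed point gets Dirichlet parameter 1; the n_pseudo prior
    pseudo-samples share the concentration alpha equally (an assumption
    of this sketch, not a detail from the paper).
    """
    params = [1.0] * n_data + [alpha / n_pseudo] * n_pseudo
    # A Dirichlet draw is a vector of independent Gamma draws, normalized.
    gammas = [rng.gammavariate(a, 1.0) for a in params]
    total = sum(gammas)
    return [g / total for g in gammas]


def sample_weighted_dataset(data, base_sampler, alpha, n_pseudo, rng):
    """Return (points, weights): observed data plus pseudo-samples from
    the base measure, with weights summing to 1."""
    pseudo = [base_sampler(rng) for _ in range(n_pseudo)]
    weights = dp_posterior_weights(len(data), n_pseudo, alpha, rng)
    return list(data) + pseudo, weights


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical base measure F0: a standard normal.
    points, weights = sample_weighted_dataset(
        [1.0, 2.0, 3.0], lambda r: r.gauss(0.0, 1.0),
        alpha=2.0, n_pseudo=4, rng=rng,
    )
    for p, w in zip(points, weights):
        print(f"{p:+.3f}  weight={w:.3f}")
```

Under these assumptions, each such weighted dataset could serve as the training set for one network through a weighted loss, so that an ensemble of networks reflects uncertainty about the data distribution. A larger alpha gives the pseudo-samples from F0 more weight, which is one way a "prior" could enter.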
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Vincent_Fortuin1
Submission Number: 8988