Abstract: We introduce a new class of Bayesian Neural Networks (BNNs) that capture (Bayesian) uncertainty in predictions by exploiting uncertainty about the underlying data-generating distribution of the training data. This distribution is treated as a random variable with a Bayesian nonparametric prior on the space of distribution functions, namely a Dirichlet Process (DP). We show that these DP-based BNNs (DP-BNNs) provide a generalized Bayesian framework for designing randomized value-function-based deep reinforcement learning (RL) algorithms. Crucially, RL with DP-BNNs makes it possible to introduce a "prior" mechanism in a principled Bayesian manner. Such a "prior" mechanism has previously been shown to be decisive for the success of randomized value-function-based deep RL algorithms (Osband et al., 2018), but a principled Bayesian procedure for it remained unknown.
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Vincent_Fortuin1
Submission Number: 8988