Abstract: Ensembles of neural networks are becoming an important tool for advancing state-of-the-art deep reinforcement learning
algorithms. However, training the many neural networks in an ensemble has an
exceedingly high computation cost, which may become a hindrance in training large-scale systems.
In this paper, we propose DNS, a Determinantal
Point Process based Neural Network Sampler that
uses a k-DPP to sample a subset of the neural networks for backpropagation at every training step, significantly reducing training
time and computation cost. We integrate DNS into
REDQ for continuous control tasks and evaluate it
on MuJoCo environments. Our experiments show
that DNS-augmented REDQ matches baseline
REDQ in average cumulative reward while
using less than 50% of the computation,
measured in FLOPs.
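
As a rough illustration of the sampling step, the sketch below draws a size-k subset of ensemble members from a k-DPP, where a subset S is chosen with probability proportional to det(L_S). It is not the paper's implementation: the ensemble size, the per-network feature vectors, and the RBF similarity kernel are all assumptions made for this example, and subsets are enumerated exhaustively, which is only practical for small ensembles.

```python
import itertools
import numpy as np


def sample_k_dpp(L, k, rng):
    """Draw a size-k subset S with P(S) proportional to det(L[S, S]).

    Exhaustive enumeration over all size-k subsets: exact, but only
    practical when the ensemble is small.
    """
    n = L.shape[0]
    subsets = list(itertools.combinations(range(n), k))
    # Clip tiny negative determinants caused by floating-point error.
    weights = np.array(
        [max(np.linalg.det(L[np.ix_(s, s)]), 0.0) for s in subsets]
    )
    total = weights.sum()
    if total <= 0.0:
        raise ValueError("every size-k subset has zero determinant")
    idx = rng.choice(len(subsets), p=weights / total)
    return subsets[idx]


rng = np.random.default_rng(0)

# Hypothetical setup: 10 networks, each summarized by a 4-d feature vector.
# How the networks are actually embedded is an assumption of this sketch.
features = rng.normal(size=(10, 4))
sq_dists = ((features[:, None, :] - features[None, :, :]) ** 2).sum(axis=-1)
L = np.exp(-0.5 * sq_dists)  # RBF similarity kernel (positive semidefinite)

# Indices of the networks that would receive a gradient update this step.
subset = sample_k_dpp(L, k=2, rng=rng)
```

Because the determinant of a submatrix shrinks as its rows become more similar, subsets of near-duplicate networks are sampled rarely, so each training step tends to update a diverse group.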