References

[ALL+15]

A. Abdolmaleki, R. Lioutikov, N. Lau, L. Paulo Reis, J. Peters, and G. Neumann. Model-based relative entropy stochastic search. In Advances in Neural Information Processing Systems (NeurIPS), 153–154. 2015.

[AZN18]

O. Arenz, M. Zhong, and G. Neumann. Efficient gradient-free variational inference using policy search. In International Conference on Machine Learning (ICML). 2018.

[AZN20]

Oleg Arenz, Mingjun Zhong, and Gerhard Neumann. Trust-region variational inference with Gaussian mixture models. Journal of Machine Learning Research, 21(163):1–60, 2020. URL: http://jmlr.org/papers/v21/19-524.html.

[GBR+12]

A. Gretton, K. M. Borgwardt, M. J. Rasch, B. Schölkopf, and A. Smola. A kernel two-sample test. Journal of Machine Learning Research (JMLR), 13:723–773, March 2012.

[KNT+18]

Mohammad Khan, Didrik Nielsen, Voot Tangkaratt, Wu Lin, Yarin Gal, and Akash Srivastava. Fast and scalable Bayesian deep learning by weight-perturbation in Adam. In Jennifer Dy and Andreas Krause, editors, Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, 2611–2620. PMLR, 10–15 Jul 2018.

[Lic13]

M. Lichman. UCI machine learning repository. 2013. URL: http://archive.ics.uci.edu/ml.

[LKS19a]

Wu Lin, Mohammad Emtiyaz Khan, and Mark Schmidt. Fast and simple natural-gradient variational inference with mixture of exponential-family approximations. In International Conference on Machine Learning, 3992–4002. PMLR, 2019.

[LKS19b]

Wu Lin, Mohammad Emtiyaz Khan, and Mark Schmidt. Stein's lemma for the reparameterization trick with exponential family mixtures. arXiv preprint arXiv:1910.13398, 2019.

[LSK20]

Wu Lin, Mark Schmidt, and Mohammad Emtiyaz Khan. Handling the positive-definite constraint in the Bayesian learning rule. In Hal Daumé III and Aarti Singh, editors, Proceedings of the 37th International Conference on Machine Learning, volume 119 of Proceedings of Machine Learning Research, 6116–6126. PMLR, 13–18 Jul 2020.

[PTA+19]

J. Pajarinen, H.L. Thai, R. Akrour, J. Peters, and G. Neumann. Compatible natural gradient policy search. Machine Learning (MLJ), pages 1443–1466, 2019.

[PS08]

Jan Peters and Stefan Schaal. Natural actor-critic. Neurocomputing, 71(7-9):1180–1190, 2008.

[SMSM99]

Richard S. Sutton, David McAllester, Satinder Singh, and Yishay Mansour. Policy gradient methods for reinforcement learning with function approximation. In S. Solla, T. Leen, and K. Müller, editors, Advances in Neural Information Processing Systems, volume 12. MIT Press, 1999. URL: https://proceedings.neurips.cc/paper/1999/file/464d828b85b0bed98e80ade0a5c43b0f-Paper.pdf.