Bidirectional Helmholtz Machines

Jorg Bornschein, Samira Shabanian, Asja Fischer, Yoshua Bengio

Feb 17, 2016 (modified: Feb 17, 2016) ICLR 2016 workshop submission readers: everyone
  • CMT id: 196
  • Abstract: Efficient unsupervised training and inference in deep generative models remain a challenging problem. One basic approach, the Helmholtz machine, trains a top-down directed generative model together with a bottom-up auxiliary model that is trained to help perform approximate inference. Recent work indicates that better generative models can be obtained with better approximate inference procedures. Instead of employing more powerful procedures, we propose to regularize the generative model to stay close to the class of distributions that can be efficiently inverted by the approximate inference model. We achieve this by interpreting both the top-down and the bottom-up directed models as approximate inference distributions and by defining the model distribution to be the geometric mean of the two. We present a lower bound for the likelihood of this model and show that optimizing this bound regularizes the model so that the Bhattacharyya distance between the bottom-up and top-down approximate distributions is minimized. We demonstrate that this approach can fit generative models with many layers of hidden binary stochastic variables to complex training distributions, that it prefers significantly deeper architectures, and that it supports orders of magnitude more efficient approximate inference than other approaches.
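
The geometric-mean construction and its link to the Bhattacharyya distance can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes two small, hand-picked joint tables `p` and `q` over (x, h) pairs standing in for the top-down and bottom-up models, and it treats `q` as a fixed joint distribution, which simplifies the construction described in the abstract. The function names are hypothetical. Because the normalizer Z of the geometric mean is at most 1, dropping the -log Z term gives a lower bound on the log-probability under the geometric-mean model, and the gap of that bound equals the Bhattacharyya distance between `p` and `q`.

```python
import math

def geometric_mean_model(p, q):
    """Return (p_star, Z) where p_star(s) = sqrt(p(s) * q(s)) / Z.

    p and q are dicts mapping the same states to probabilities.
    """
    unnorm = {s: math.sqrt(p[s] * q[s]) for s in p}
    Z = sum(unnorm.values())  # Bhattacharyya coefficient, at most 1
    p_star = {s: v / Z for s, v in unnorm.items()}
    return p_star, Z

def bhattacharyya_distance(p, q):
    """D_B(p, q) = -log sum_s sqrt(p(s) * q(s))."""
    return -math.log(sum(math.sqrt(p[s] * q[s]) for s in p))

# Toy joint tables over (x, h); the values are assumptions chosen for illustration.
p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}    # stand-in for the top-down model
q = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.1, (1, 1): 0.4}  # stand-in for the bottom-up model

p_star, Z = geometric_mean_model(p, q)
d_b = bhattacharyya_distance(p, q)

# For every state, log p_star(s) = 0.5 * (log p(s) + log q(s)) - log Z.
# Dropping -log Z (which is >= 0) leaves a lower bound with gap D_B.
gaps = []
for s in p:
    exact = math.log(p_star[s])
    bound = 0.5 * (math.log(p[s]) + math.log(q[s]))
    gaps.append(exact - bound)
    print(s, round(exact, 4), round(bound, 4))

print("Z =", Z, "D_B =", d_b)
```

In this toy setting the bound becomes tight exactly when `p` and `q` coincide, since then Z = 1 and the distance is zero. That is the sense in which optimizing such a bound pushes the two models toward each other.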