Abstract: Variational inference with natural-gradient descent often shows fast convergence in practice, but its theoretical convergence guarantees have been challenging to establish. This is true even for the simplest cases that involve concave log-likelihoods and use a Gaussian approximation. We show that the challenge can be circumvented for such cases using a square-root parameterization for the Gaussian covariance. This approach establishes novel convergence guarantees for natural-gradient variational-Gaussian inference and its continuous-time gradient flow. Our experiments demonstrate the effectiveness of natural gradient methods and highlight their advantages over algorithms that use Euclidean or Wasserstein geometries.
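To make the setting concrete, below is a minimal, hypothetical sketch of natural-gradient variational-Gaussian inference on a concave log-joint (a toy Bayesian linear regression). It illustrates the standard precision/mean natural-gradient updates that this line of work builds on; it is not the paper's square-root-parameterized scheme, and all names, the toy model, and hyperparameters are assumptions made for illustration.

```python
# Hypothetical sketch: natural-gradient variational-Gaussian inference on a concave
# log-joint (Bayesian linear regression). Not the paper's square-root-parameterized
# method; the updates below are the standard precision/mean natural-gradient steps.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data with a concave log-likelihood: y = X w + noise, Gaussian prior on w.
n, d, noise_var, prior_var = 50, 3, 0.5, 10.0
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + rng.normal(scale=np.sqrt(noise_var), size=n)

def grad_log_joint(w):
    """Gradient of log p(y, w): Gaussian likelihood plus Gaussian prior (concave)."""
    return X.T @ (y - X @ w) / noise_var - w / prior_var

def hess_log_joint(_w):
    """Hessian of log p(y, w); constant here because the model is linear-Gaussian."""
    return -X.T @ X / noise_var - np.eye(d) / prior_var

# Variational Gaussian q(w) = N(m, Lambda^{-1}) with natural-gradient updates.
m, Lam = np.zeros(d), np.eye(d)
rho, n_mc = 0.2, 8  # step size and number of Monte Carlo samples for the expectations
for _ in range(200):
    Sigma = np.linalg.inv(Lam)
    W = m + rng.normal(size=(n_mc, d)) @ np.linalg.cholesky(Sigma).T  # samples from q
    g = np.mean([grad_log_joint(w) for w in W], axis=0)               # E_q[gradient]
    H = np.mean([hess_log_joint(w) for w in W], axis=0)               # E_q[Hessian]
    Lam = (1 - rho) * Lam - rho * H        # precision update (PSD since -H is PSD)
    m = m + rho * np.linalg.solve(Lam, g)  # mean step preconditioned by the precision

print("posterior mean estimate:", m)
```

Because the log-joint is concave, the expected Hessian is negative semi-definite, so the precision iterate stays positive definite; the paper's square-root parameterization addresses the analogous constraint directly on a covariance factor.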
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: As requested, we have reformulated the following statement:
> Our findings indicate that the success of these methods is heavily influenced by the choice of parameterization and the intrinsic geometry of the optimization landscape
to
> Our findings suggest that the effectiveness of these methods can be affected by the choice of parameterization and the underlying geometry of the optimization landscape; however, a definitive link has yet to be established.
Assigned Action Editor: ~Jan-Willem_van_de_Meent1
Submission Number: 3805