The Volume of Non-Restricted Boltzmann Machines and Their Double Descent Model Complexity

Oct 19, 2020 (edited Nov 09, 2020) | NeurIPS 2020 Workshop DL-IG Blind Submission | Readers: Everyone
  • Keywords: Minimum description length, geometric volume, Boltzmann machine, double descent, generalization
  • TL;DR: A double descent behavior manifests in the minimum description length principle when considering the geometric volume of a fully-observed Boltzmann machine.
  • Abstract: The double descent risk phenomenon has received much interest in the machine learning and statistics communities. Motivated by Rissanen's minimum description length (MDL) principle and Amari's information geometry, we investigate how a double descent-like behavior may manifest by considering the $\log V$ modeling term, the logarithm of the model volume. In particular, we study the $\log V$ term for the general class of fully-observed statistical lattice models, of which Boltzmann machines form a subset. We find that for such models the $\log V$ term can decrease with increasing model dimensionality, at a rate that appears to overwhelm the classically understood $\mathcal{O}(D)$ complexity terms of AIC and BIC. Our analysis aims to deepen the understanding of how double descent behavior may arise in deep lattice structures and, by extension, why generalization error need not continue to grow with increasing model dimensionality.
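
For orientation, the role of the $\log V$ term can be read off the standard Fisher-information approximation to the MDL code length. The display below is a sketch of that textbook form, not a formula quoted from this submission: the symbols $N$ (sample size), $\hat{\theta}$ (maximum-likelihood estimate), $I(\theta)$ (per-sample Fisher information matrix), and $\Theta$ (parameter space) are assumptions introduced here, and the submission's exact definition of $V$ may differ.

$$
\mathrm{MDL} \;\approx\; -\log p\!\left(x^N \mid \hat{\theta}\right) \;+\; \frac{D}{2}\log\frac{N}{2\pi} \;+\; \log V,
\qquad
V = \int_{\Theta} \sqrt{\det I(\theta)}\;\mathrm{d}\theta .
$$

By comparison, AIC and BIC penalize the maximized log-likelihood by $D$ and $\frac{D}{2}\log N$ respectively (in log-likelihood units), and both penalties grow with $D$. A $\log V$ term that decreases quickly enough in $D$ could therefore offset them, which is the kind of behavior the abstract describes.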