When Does a Low-Rank Bayesian Neural Network Certify Its Deterministic Center?
Keywords: Low-rank Bayesian neural networks, PAC-Bayes generalization bounds, margin bounds, deterministic center network, balanced factorization, structured variational inference, Gaussian variational posterior, non-identifiability, posterior-induced perturbations, rank-sensitive certification
TL;DR: Balanced factorization resolves factor-space non-identifiability in low-rank BNNs, yielding a representation-invariant PAC-Bayes margin certificate for the deterministic center via explicit conditions on the learned posterior scales.
Abstract: We study when a structured low-rank Gaussian variational posterior can certify a
deterministic predictor in a Bayesian neural network with factorized layers
$W_i=A_iB_i^\top$. The same low-rank Bayesian model gives rise to three natural
certification targets: the posterior Gibbs predictor, the posterior predictive
mean, and a deterministic center network. This paper focuses on the
deterministic-center route.
The main obstruction is factor non-identifiability: $(A_i,B_i)$ and
$(cA_i,c^{-1}B_i)$ induce the same weight matrix but different factor norms, so
a naive PAC-Bayes margin certificate in factor coordinates is representation
dependent. We resolve this by passing to balanced factors obtained from the
singular value decomposition of the center weights.
On this balanced factor space, we combine the PAC-Bayes margin framework of
Neyshabur et al. with rectangular Gaussian operator-norm bounds to derive explicit
perturbation budgets and a margin bound for a deterministic center network. In
the balanced Gaussian variational setting, we give sufficient
conditions on the learned posterior scales under which the variational posterior
itself serves as the certifying perturbation law, thereby yielding a PAC-Bayes
margin bound for the corresponding deterministic center network through the
actual posterior geometry rather than through an auxiliary perturbation chosen
only for analysis.
Under matched covariances, the resulting complexity term reduces to a sum of
nuclear-norm contributions from the center weights, yielding rank-sensitive
control and potentially sharper certificates when the intrinsic ranks are
small.
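
The non-identifiability and its balanced resolution can be illustrated numerically. The sketch below is not the paper's implementation: it assumes the common convention that balanced factors are $A=U\Sigma^{1/2}$ and $B=V\Sigma^{1/2}$ for the truncated SVD $W=U\Sigma V^\top$, and the helper name `balanced_factors`, the matrix sizes, and the rescaling constant `c` are hypothetical choices made only for illustration. Under that assumption, the summed squared Frobenius norms of the balanced factors equal twice the nuclear norm of $W$, regardless of which rescaled factor pair produced $W$.

```python
import numpy as np


def balanced_factors(W, rank):
    """Return (A, B) with W ~= A @ B.T and A.T @ A == B.T @ B.

    Assumed convention (not taken from the paper): A = U sqrt(S), B = V sqrt(S).
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    root = np.sqrt(s[:rank])
    A = U[:, :rank] * root   # scale each left singular vector by sqrt(sigma)
    B = Vt[:rank].T * root   # scale each right singular vector by sqrt(sigma)
    return A, B


rng = np.random.default_rng(0)
m, n, r = 6, 4, 2            # hypothetical layer shape and rank
A0 = rng.standard_normal((m, r))
B0 = rng.standard_normal((n, r))
W = A0 @ B0.T

# Rescaling the factors leaves W unchanged but changes the factor norms.
c = 10.0
A1, B1 = c * A0, B0 / c
same_weight = np.allclose(W, A1 @ B1.T)
cost0 = np.sum(A0**2) + np.sum(B0**2)
cost1 = np.sum(A1**2) + np.sum(B1**2)

# Balancing depends only on W, so both factor pairs give the same norms.
Ab0, Bb0 = balanced_factors(A0 @ B0.T, r)
Ab1, Bb1 = balanced_factors(A1 @ B1.T, r)
balanced_cost0 = np.sum(Ab0**2) + np.sum(Bb0**2)
balanced_cost1 = np.sum(Ab1**2) + np.sum(Bb1**2)
nuclear_norm = np.linalg.norm(W, ord="nuc")

print("same weight matrix:", same_weight)
print("raw factor costs:", cost0, cost1)
print("balanced costs:", balanced_cost0, balanced_cost1)
print("2 * nuclear norm:", 2 * nuclear_norm)
```

The raw factor costs differ between the two parameterizations, while the balanced costs coincide and match $2\|W\|_*$. This is one way to see why a complexity term stated in balanced coordinates can be expressed through nuclear norms of the center weights.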
Submission Number: 42