Keywords: Predictive coding, energy-based models, biologically plausible learning
Abstract: Predictive coding networks are neural models that perform inference through an iterative energy minimization process. While effective in shallow architectures, they suffer significant performance degradation beyond five to seven layers. In this work, we show that this degradation is caused by two problems: exponentially imbalanced errors between layers during weight updates, and predictions from earlier layers failing to effectively guide updates in deeper layers. Furthermore, when training models with skip connections, the energy propagated by the residual connections reaches higher layers faster than the energy propagated through the main pathway, degrading test accuracy. We address the first issue by introducing a novel precision-weighted optimization of latent variables that balances error distributions during the relaxation phase, the second by proposing a novel weight update mechanism that reduces error accumulation in deeper layers, and the third by using identity nodes that slow down the propagation of energy through the residual connections. Empirically, our methods achieve performance comparable to backpropagation on deep models such as ResNet18, opening new possibilities for predictive coding in complex tasks.
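The following is a minimal sketch of the iterative energy minimization (relaxation) phase referred to in the abstract, with a per-layer precision weighting of the errors. The toy fully connected architecture, the `relax` function, and the use of inverse mean squared error as the precision term are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Illustrative toy setup; layer sizes and weights are assumptions, not from the paper.
rng = np.random.default_rng(0)
sizes = [784, 128, 64, 10]
W = [rng.normal(0.0, 0.05, (sizes[l + 1], sizes[l])) for l in range(len(sizes) - 1)]

def relax(x0, n_steps=50, lr=0.1):
    """Inference (relaxation) phase of a predictive coding network.

    Latent activities are initialized by a forward sweep and then updated by
    gradient descent on the energy E = sum_l 0.5 * ||eps_l||^2 / pi_l, where
    eps_l = x_{l+1} - W_l x_l and pi_l is a per-layer precision. Here pi_l is
    taken as the layer's mean squared error (a stand-in for the paper's
    precision-weighted scheme), so that layers with exponentially larger
    errors do not dominate the latent updates.
    """
    # Forward initialization of latent activities.
    x = [x0]
    for l in range(len(W)):
        x.append(W[l] @ x[l])

    for _ in range(n_steps):
        # Per-layer prediction errors and precision weights.
        eps = [x[l + 1] - W[l] @ x[l] for l in range(len(W))]
        pi = [np.mean(e ** 2) + 1e-8 for e in eps]
        # Update hidden latents; the top latent could instead be clamped to a label.
        for l in range(1, len(x) - 1):
            dx = -eps[l - 1] / pi[l - 1] + W[l].T @ (eps[l] / pi[l])
            x[l] = x[l] + lr * dx
    return x, eps

x, eps = relax(rng.normal(size=sizes[0]))
print([float(np.mean(e ** 2)) for e in eps])  # per-layer errors after relaxation
```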
Primary Area: applications to neuroscience & cognitive science
Submission Number: 18101