Keywords: Category theory, deep learning, backpropagation, natural gradient, symmetric monoidal category, parameterized maps, neural architecture search, Fisher information
TL;DR: Backprop as contravariant functor, natural gradient as natural transformation; compositionality and uniqueness theorems; 23\% NAS improvement with stability guarantees.
Abstract: We develop a comprehensive categorical framework for deep learning that unifies neural network architectures, gradient computation, and optimization algorithms within a single mathematical structure. We model neural network architectures as morphisms in a symmetric monoidal category $\mathbf{Para}(\mathbf{C})$ of parameterized maps, formalize backpropagation as a contravariant functor to the category of gradient flows $\mathbf{Grad}(\mathbf{C})$, and characterize natural gradient descent as a natural transformation in the functor category. This framework yields three main contributions: (1) \textit{compositionality guarantees} proving that modular training equals end-to-end training precisely when categorical coherence conditions hold, (2) a \textit{uniqueness theorem} establishing that any gradient-based optimizer preserving functorial consistency and reparameterization invariance must be naturally isomorphic to the Fisher natural gradient, and (3) \textit{new constraints on architecture search} derived from monoidal coherence conditions that systematically eliminate architectures prone to training instability. We validate these theoretical results through experiments demonstrating that categorical coherence constraints improve neural architecture search by 23\% over baseline methods on CIFAR-10 and ImageNet, with certified stability guarantees for modular training pipelines.
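To make the optimizer in contribution (2) concrete, the following is a minimal, hypothetical sketch (not the paper's implementation) of the Fisher natural gradient update $\theta \leftarrow \theta - \eta\, F(\theta)^{-1} \nabla L(\theta)$ for a toy logistic model, with $F$ estimated by the empirical Fisher; the function name `natural_gradient_step`, the damping term, and all hyperparameters are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def natural_gradient_step(theta, X, y, lr=0.1, damping=1e-3):
    """One Fisher natural gradient step for logistic regression:
    theta <- theta - lr * F(theta)^{-1} grad L(theta).
    F is the empirical Fisher (average of outer products of
    per-example score vectors); damping keeps it invertible."""
    n = X.shape[0]
    p = sigmoid(X @ theta)                  # predicted probabilities
    grad = X.T @ (p - y) / n                # gradient of the mean NLL
    scores = X * (p - y)[:, None]           # per-example gradients of the NLL
    fisher = scores.T @ scores / n + damping * np.eye(theta.size)
    return theta - lr * np.linalg.solve(fisher, grad)

# Toy usage on synthetic data (purely illustrative)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_theta = np.array([1.5, -2.0, 0.5])
y = (sigmoid(X @ true_theta) > rng.uniform(size=200)).astype(float)
theta = np.zeros(3)
for _ in range(100):
    theta = natural_gradient_step(theta, X, y)
print(theta)  # should approach true_theta up to sampling noise
```

Solving against $F(\theta)$ rather than the identity is what makes the step invariant to smooth reparameterizations of $\theta$, which is the invariance property the uniqueness theorem takes as a hypothesis.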
Submission Number: 158