Constructive Universal Approximation Theorems for Deep Joint-Equivariant Networks by Schur's Lemma

15 May 2024 (modified: 06 Nov 2024) · Submitted to NeurIPS 2024 · CC BY 4.0
Keywords: Schur's lemma, deep neural network, joint-group-equivariant, universality, ridgelet transform
TL;DR: We present a unified constructive universal approximation theorem for both shallow and deep neural networks based on group representation theory.
Abstract: We present a unified constructive universal approximation theorem, based on group representation theory, that covers a wide range of learning machines including both shallow and deep neural networks. Constructive here means that the distribution of parameters is given in a closed-form expression (called the *ridgelet transform*). In contrast to shallow models, the expressive power of deep models has been analyzed in a case-by-case manner. Recently, Sonoda et al. (2023a,b) developed a systematic method for deriving a constructive approximation theorem from *scalar-valued joint-group-invariant* feature maps, covering a formal deep network. However, each hidden layer was formalized as an abstract group action, so the method could not cover real deep networks defined by composites of nonlinear activation functions. In this study, we extend the method to *vector-valued joint-group-equivariant* feature maps, so as to cover such real networks.
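For orientation, the closed-form parameter distribution mentioned in the abstract can be illustrated by the classical shallow-network case (the notation $\sigma$, $\psi$, $\gamma$ below is a standard sketch of the ridgelet calculus, not taken from this submission):

```latex
% A shallow network written as an integral over parameters (a, b),
% with activation \sigma and an admissible dual function \psi:
\begin{align}
  f(x) &= \int_{\mathbb{R}^m \times \mathbb{R}} \gamma(a,b)\,
          \sigma(a \cdot x - b)\,\mathrm{d}a\,\mathrm{d}b, \\
  R[f](a,b) &= \int_{\mathbb{R}^m} f(x)\,\psi(a \cdot x - b)\,\mathrm{d}x.
\end{align}
% Universality is "constructive" because choosing \gamma = R[f]
% reconstructs f in closed form, rather than merely asserting existence.
```

The paper's contribution, per the abstract, is to extend this closed-form picture from shallow models to deep networks via vector-valued joint-group-equivariant feature maps and Schur's lemma.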
Primary Area: Deep learning architectures
Submission Number: 16692