Learning Convolutional Representations via Generalized Stein’s Method: A Training-Free Approach

Date: 12 Sept 2025 (modified: 17 Nov 2025)
Venue: ICLR 2026 Conference Withdrawn Submission
Readers: Everyone
License: CC BY 4.0
Keywords: Representation learning, Convolutional Neural Networks, Multi-index model, Score function, Singular value decomposition
TL;DR: A statistical approach for learning convolutional representations without training a CNN—efficient, theory-backed, and effective for medical imaging.
Abstract: Convolutional Neural Networks (CNNs) have revolutionized computer vision, with the convolution operation serving as a cornerstone that enables the extraction of abstract features and the discovery of hidden structures in image data. However, CNN training typically relies on gradient descent, which can be computationally expensive and unstable, particularly in high-dimensional, small-sample settings such as medical image analysis. This paper presents an efficient statistical approach to learning convolutional representations without training a CNN. We reformulate a CNN as a general index model with matrix-valued inputs, interpreting the convolution filters as index vectors and absorbing all subsequent layers into the link function. Using a generalized version of the first-order Stein's formula, we develop a novel singular value decomposition (SVD) based approach that estimates the convolution filters directly. Theoretical analysis suggests that our estimator achieves an optimal convergence rate, comparable to that of generalized linear models in which the link function is known. Extensive simulations and medical imaging experiments demonstrate the effectiveness of our approach, providing a viable pathway for representation learning.
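To make the Stein-plus-SVD idea concrete, the following is a minimal sketch of a simplified special case, not the paper's estimator. It assumes a single-index model y = g(uᵀXv) + noise with matrix inputs X whose entries are i.i.d. standard Gaussian, so the first-order score function is X itself and E[yX] is proportional to uvᵀ. The function name `stein_svd_estimate`, the tanh link, and all dimensions are hypothetical choices made for illustration.

```python
import numpy as np

def stein_svd_estimate(X, y):
    """Estimate index directions from the first-order Stein moment.

    Assumes (for this sketch only) that X has i.i.d. N(0, 1) entries,
    so the score function is X and E[y X] = E[g'(u^T X v)] * u v^T.
    X: array of shape (n, p, q); y: array of shape (n,).
    """
    # Empirical first-order moment: (1/n) * sum_i y_i X_i, shape (p, q).
    M = np.einsum("i,ijk->jk", y, X) / len(y)
    # The leading singular vectors estimate u and v up to a common sign.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, 0], Vt[0], s

# Synthetic check with a link function the estimator never sees.
rng = np.random.default_rng(0)
n, p, q = 20000, 8, 6
u = rng.standard_normal(p)
u /= np.linalg.norm(u)
v = rng.standard_normal(q)
v /= np.linalg.norm(v)
X = rng.standard_normal((n, p, q))
z = np.einsum("j,ijk,k->i", u, X, v)           # index u^T X_i v per sample
y = np.tanh(z) + 0.1 * rng.standard_normal(n)  # hypothetical link plus noise

u_hat, v_hat, s = stein_svd_estimate(X, y)
align_u = abs(u @ u_hat)  # absolute cosine between true and estimated u
align_v = abs(v @ v_hat)  # absolute cosine between true and estimated v
```

No gradient descent is involved: the estimate is one averaged moment followed by one SVD. The directions are identified only up to sign, which is why the alignment is measured with an absolute value. Non-Gaussian inputs would require a different score function in place of X.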
Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 4292