Revisiting the expressiveness of CNNs: a mathematical framework for feature extraction

TMLR Paper2625 Authors

04 May 2024 (modified: 16 Jul 2024) | Rejected by TMLR | CC BY-SA 4.0
Abstract: Over the past decade, deep learning has revolutionized the field of computer vision, with convolutional neural network models proving to be very effective on image classification benchmarks. Given their widespread adoption, several works have attempted to analyze their expressiveness and to study the class of functions that they can realize. However, a fundamental theoretical question remains unanswered: why can CNNs express discrete image classification functions that involve feature extraction? We address this question in this paper by introducing a novel mathematical model for image classification, based on feature extraction, that can be used to generate images resembling real-world datasets. We show that convolutional neural network classifiers can express a class of functions based on our simplified model of image classification datasets. In our proof, we construct piecewise linear functions that detect the presence of features, and show that they can be realized by a convolutional network.
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: Changes have been made throughout the paper to clarify its contributions and the directions for future research.
Assigned Action Editor: ~Yunhe_Wang1
Submission Number: 2625