TL;DR: This paper presents path convolution (PathConv), a novel CNN design built exclusively from 1D operations that achieves ResNet-level accuracy with only one-third of the parameters.
Abstract: Two-dimensional (2D) convolutional kernels have dominated convolutional neural networks (CNNs) in image processing. Although 1D convolution is parameter-efficient because it scales linearly, naively integrating it into CNNs disrupts image locality and degrades performance. This paper presents path convolution (PathConv), a novel CNN design built exclusively from 1D operations that achieves ResNet-level accuracy with only one-third of the parameters. To obtain locality-preserving image traversal paths, we analyze Hilbert and Z-order paths and expose a fundamental trade-off: improved proximity for most pixels comes at the cost of excessive distances between a few sacrificed pixels and their neighbors. We resolve this issue by proposing path shifting, a succinct method to reposition the sacrificed pixels. Using a randomized rounding algorithm, we show that three shifted paths are sufficient to offer better locality preservation than trivial raster scanning. To mitigate potential convergence issues caused by multiple paths, we design a lightweight path-aware channel attention mechanism that captures local intra-path and global inter-path dependencies. Experimental results further validate the efficacy of our method, establishing the proposed 1D PathConv as a viable backbone for efficient vision models.
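The locality trade-off described in the abstract can be made concrete with a small measurement. The following is a minimal sketch, not the authors' code: it assumes a square image whose side is a power of two, and the helper names (`morton_index`, `path_positions`, `neighbor_gaps`) are hypothetical. It compares how far apart spatially adjacent pixels land along a raster path versus a Z-order path.

```python
import numpy as np

def morton_index(x: int, y: int, bits: int) -> int:
    """Interleave the bits of x and y to obtain the Z-order (Morton) index."""
    idx = 0
    for b in range(bits):
        idx |= ((x >> b) & 1) << (2 * b)       # x bits go to even positions
        idx |= ((y >> b) & 1) << (2 * b + 1)   # y bits go to odd positions
    return idx

def path_positions(n: int, kind: str) -> np.ndarray:
    """Return pos[y, x], the position of pixel (x, y) along the chosen path.

    Assumes n is a power of two; kind is "raster" or "zorder".
    """
    bits = n.bit_length() - 1
    pos = np.empty((n, n), dtype=np.int64)
    for y in range(n):
        for x in range(n):
            if kind == "raster":
                pos[y, x] = y * n + x
            else:
                pos[y, x] = morton_index(x, y, bits)
    return pos

def neighbor_gaps(pos: np.ndarray) -> np.ndarray:
    """1D path distance between all horizontally or vertically adjacent pixels."""
    horiz = np.abs(pos[:, 1:] - pos[:, :-1]).ravel()
    vert = np.abs(pos[1:, :] - pos[:-1, :]).ravel()
    return np.concatenate([horiz, vert])

if __name__ == "__main__":
    for kind in ("raster", "zorder"):
        gaps = neighbor_gaps(path_positions(8, kind))
        print(kind, "share of gaps <= 2:", (gaps <= 2).mean(), "max gap:", gaps.max())
```

On an 8×8 grid, the share of adjacent pixel pairs that stay within two steps along the path rises from 50% for raster scanning to about 57% for Z-order, while the worst-case gap grows from 8 to 22. That minority of far-apart pairs corresponds to the "sacrificed" pixels that path shifting is meant to reposition.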
Lay Summary: Modern computer vision systems use complex 2D operations to analyze images, making them computationally expensive and resource-intensive. While 1D operations could be more efficient, they typically disrupt the natural neighborhood relationships between pixels, leading to poor performance.
We developed path convolution (PathConv), a novel approach that processes images using only 1D operations. We preserve local pixel relationships by designing special pixel traversal paths and applying a path-shifting technique. We use randomized rounding algorithms to show that just three shifted paths provide better locality preservation than traditional raster scanning. We also created a lightweight attention mechanism that helps the system understand relationships within and between these paths.
Our PathConv models achieve accuracy comparable to standard methods (ResNet) while using only one-third of the model parameters. This makes image processing more accessible for devices with limited computing power, potentially enabling more efficient computer vision on smartphones, IoT devices, and other resource-constrained systems.
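To illustrate what processing an image with only 1D operations along a traversal path can look like, here is a minimal NumPy sketch. It is an assumption-laden illustration rather than the PathConv implementation (see the linked repository for that): the function name `conv_along_path` and its `order` argument are hypothetical, and it handles a single channel with zero padding at both ends of the path.

```python
import numpy as np

def conv_along_path(image: np.ndarray, order: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Apply a 1D kernel to the pixels visited in `order`, then restore the 2D layout.

    image:  (H, W) array, single channel
    order:  (H*W,) array of flat pixel indices defining the traversal path
            (hypothetical input; any permutation of range(H*W) works)
    kernel: (k,) array of 1D weights, k odd
    """
    seq = image.ravel()[order]  # read pixels in path order
    # np.convolve flips its second argument, so reverse the kernel to get
    # cross-correlation, the operation CNN layers usually call "convolution".
    out_seq = np.convolve(seq, kernel[::-1], mode="same")
    out = np.empty(seq.shape, dtype=out_seq.dtype)
    out[order] = out_seq        # scatter results back to their pixel locations
    return out.reshape(image.shape)

if __name__ == "__main__":
    img = np.arange(16, dtype=float).reshape(4, 4)
    raster = np.arange(16)      # trivial raster-scan path
    print(conv_along_path(img, raster, np.array([1.0, 1.0, 1.0])))
```

A length-k 1D kernel has k weights per input/output channel pair, whereas a k×k 2D kernel has k², which is where the parameter saving of 1D designs comes from. The traversal path then determines which pixels the k taps actually see.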
Link To Code: https://github.com/tum-bgd/2025-ICML-1DPathConv
Primary Area: Deep Learning
Keywords: 1D convolution, Hilbert curve, Z-order curve, convolutional neural network, deep learning, computer vision
Flagged For Ethics Review: true
Submission Number: 9681