Abstract: While 2D convolutional neural networks (CNNs) demonstrate outstanding performance on computer vision tasks, their computational costs remain high. This paper reduces these costs by introducing a novel architecture that replaces spatial 2D CNN operations with two consecutive 1D depthwise separable convolution (DSC) operations. Although vision inputs are two-dimensional, these 1D DSCs treat them as 1D inputs, under the assumption that the dataset supports such convolution operations with little or no loss of training accuracy. However, 1D DSCs still suffer from gradient problems when the network becomes deep. We therefore modify the construction of 1D CNNs with residual connections to improve the performance of deep 1D CNN architectures and introduce our final novel architecture, residual 1D convolutional networks (RCNs), for 1D vision inputs. Extensive benchmark evaluation shows that RCNs achieve at least 1% higher performance with about \(77\%\), \(86\%\), \(75\%\), and \(34\%\) fewer parameters, and about \(75\%\), \(80\%\), \(67\%\), and \(26\%\) fewer FLOPs than ResNets, wide ResNets, MobileNets, and SqueezeNexts, respectively, on the CIFAR benchmarks, SVHN, and Tiny ImageNet image classification datasets. Moreover, our proposed RCNs improve deep recursive residual network performance with 94% fewer parameters on the image super-resolution dataset.
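The following is a minimal PyTorch sketch of the core idea described above: a residual block in which two consecutive 1D depthwise separable convolutions stand in for a spatial 2D convolution, with a skip connection around them. It assumes the two 1D DSCs sweep the width and height axes of the feature map in turn; the module names, channel counts, and this particular decomposition are illustrative assumptions, not the authors' reference implementation.

```python
# Hedged sketch of a residual 1D-DSC block; not the paper's official code.
import torch
import torch.nn as nn


class DSC1D(nn.Module):
    """Depthwise separable 1D convolution: depthwise conv then pointwise (1x1) conv."""

    def __init__(self, channels, kernel_size=3):
        super().__init__()
        self.depthwise = nn.Conv1d(channels, channels, kernel_size,
                                   padding=kernel_size // 2, groups=channels)
        self.pointwise = nn.Conv1d(channels, channels, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))


class RCNBlock(nn.Module):
    """Residual block: two consecutive 1D DSCs replace one spatial 2D convolution.

    The (N, C, H, W) feature map is reshaped so the first DSC runs along the
    width axis and the second along the height axis; the block input is then
    added back through a residual connection.
    """

    def __init__(self, channels, kernel_size=3):
        super().__init__()
        self.dsc_w = DSC1D(channels, kernel_size)
        self.dsc_h = DSC1D(channels, kernel_size)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        n, c, h, w = x.shape
        # 1D DSC along the width axis: fold height into the batch dimension.
        y = x.permute(0, 2, 1, 3).reshape(n * h, c, w)
        y = self.act(self.dsc_w(y))
        y = y.reshape(n, h, c, w).permute(0, 2, 1, 3)
        # 1D DSC along the height axis: fold width into the batch dimension.
        y = y.permute(0, 3, 1, 2).reshape(n * w, c, h)
        y = self.act(self.dsc_h(y))
        y = y.reshape(n, w, c, h).permute(0, 2, 3, 1)
        # Residual connection, as in the proposed RCNs.
        return self.act(x + y)


if __name__ == "__main__":
    block = RCNBlock(channels=16)
    out = block(torch.randn(2, 16, 32, 32))
    print(out.shape)  # torch.Size([2, 16, 32, 32])
```

Because each depthwise 1D convolution touches only one spatial axis and one channel at a time, a block like this uses far fewer parameters and multiply-accumulates than the dense 2D convolution it replaces, which is the source of the savings reported in the abstract.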
External IDs: dblp:conf/icpr/ShahadatM24