A light-weight backbone to adapt with extracting grouped dilation features

Published: 01 Jan 2025, Last Modified: 18 Jun 2025 · Pattern Anal. Appl. 2025 · CC BY-SA 4.0
Abstract: Grouped dilation features (GDFs) improved the learning ability of MobileNetV1 for image representation. However, the computational complexity remains high while the performance is only modest. This cost stems mainly from the MobileNetV1 backbone, whose last several layers operate on deep feature maps. To mitigate these issues, we propose a lightweight network (called CGDF-Net) with an adaptive architecture that effectively extracts grouped dilation features. CGDF-Net makes two main contributions: (i) the backbone is simplified by replacing several of the last layers of MobileNetV1 with a pointwise convolutional layer, reducing the computational complexity; (ii) an attention mechanism is embedded into the GDF block to form a completed GDF perceptron (CGDF) that steers the learning process toward the salient properties of objects in images rather than trivial ones. Experimental results on benchmark image-recognition datasets validate that the proposed CGDF-Net achieves good performance at a small computational cost in comparison with MobileNets and other lightweight models. For instance, CGDF-Net obtains 60.86% with 3.53M learnable parameters on Stanford Dogs, about 6% better than MobileNetV1-GDF (54.9%, 3.39M) and 9% better than MobileNetV1 (51.6%, 3.33M). Meanwhile, CGDF-Net attains 85.22% on ImageNet-100, about 6%–8% higher than MobileNetV1-GDF (79.14%) and MobileNetV1 (77.01%), respectively. The code of CGDF-Net is available at https://github.com/nttbdrk25/CGDFNet.
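The two ideas in the abstract can be sketched in plain NumPy. This is a hypothetical illustration, not the authors' code: `pointwise_conv` and `channel_attention` are names invented here, the attention gate is a parameter-free squeeze-and-excite-style stand-in for whatever mechanism CGDF actually uses, and feature maps are assumed to be laid out as (channels, height, width).

```python
import numpy as np

def pointwise_conv(x, w):
    """1x1 (pointwise) convolution: a per-pixel linear map over channels.
    x: (C_in, H, W), w: (C_out, C_in) -> output of shape (C_out, H, W)."""
    return np.tensordot(w, x, axes=([1], [0]))

def channel_attention(x):
    """Minimal channel-attention gate (squeeze-and-excite style, no learned
    weights): global-average-pool each channel, squash to (0, 1) with a
    sigmoid, and rescale the channels by the resulting gate."""
    s = x.mean(axis=(1, 2))            # squeeze: one scalar per channel, (C,)
    g = 1.0 / (1.0 + np.exp(-s))       # excite: sigmoid gate in (0, 1)
    return x * g[:, None, None]        # reweight channels

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))     # toy feature map: 8 channels, 4x4
w = rng.standard_normal((16, 8))       # pointwise weights: 8 -> 16 channels

y = channel_attention(pointwise_conv(x, w))
print(y.shape)  # (16, 4, 4)

# Why the backbone swap saves cost: a 1x1 layer needs C_out * C_in weights,
# versus C_out * C_in * 9 for a 3x3 layer at the same channel widths, so
# replacing deep 3x3 layers with one pointwise layer cuts parameters and FLOPs.
```

The gate leaves the tensor shape unchanged, so such an attention step can be dropped into a block (here, the GDF block) without altering the surrounding architecture.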