Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: CNN, attention, low-data regime, classification
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: In the rapidly evolving landscape of deep learning for computer vision, various architectures have been proposed to achieve state-of-the-art performance in tasks such as object recognition, image segmentation, and classification. While models pretrained on large datasets like ImageNet have been the cornerstone of transfer learning in many applications, this paper introduces CAReNet (Convolutional Attention Residual Network), a novel architecture trained from scratch, without pretrained weights. CAReNet combines convolutional layers, attention mechanisms, and residual connections into a holistic approach to feature extraction and representation learning. Notably, CAReNet matches or exceeds the performance of ResNet50 on the same training data while using fewer parameters. Training CAReNet from scratch proved necessary, since its architectural differences render feature representations incompatible with those of pretrained models. Furthermore, training new models on large, general-purpose datasets to obtain pretrained weights requires time, accurate labels, and powerful machines, which poses significant barriers in many domains. The absence of pretrained weights for CAReNet is therefore not only a constraint but also an opportunity for architecture-specific optimization. In certain domains, such as space and medical imaging, the features learned from ImageNet differ substantially from those required for the target task and can introduce bias during training, given the gap between the pretraining domain and the transfer task. This work underscores the importance of architecture-specific training strategies for optimizing performance and demonstrates that CAReNet achieves competitive results with a more compact architecture. Experiments were carried out on several benchmark datasets, including Tiny ImageNet, for image classification. CAReNet outperforms ResNet50 by 2.61% on Tiny ImageNet and by 1.9% on STL10 while being nearly half its size, a balance of compactness and accuracy that makes it an efficient alternative among deep learning architectures.
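The abstract names the three ingredients of a CAReNet block (convolution, attention, residual connection) but gives no implementation details. Below is a minimal, hypothetical PyTorch sketch of a generic block combining those three ingredients; the layer sizes and the squeeze-and-excitation-style channel attention are illustrative assumptions, not the authors' actual design.

```python
# Illustrative sketch only: the abstract does not specify CAReNet's internals.
import torch
import torch.nn as nn


class ConvAttnResidualBlock(nn.Module):
    """Generic pattern: conv stack -> channel attention -> residual add."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        # Squeeze-and-excitation-style channel attention (an assumption here,
        # standing in for whatever attention mechanism CAReNet actually uses).
        self.attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.conv(x)
        out = out * self.attn(out)  # reweight channels by attention scores
        return self.act(out + x)    # residual connection


if __name__ == "__main__":
    block = ConvAttnResidualBlock(channels=64)
    y = block(torch.randn(2, 64, 32, 32))
    print(y.shape)  # torch.Size([2, 64, 32, 32])
```

Such a block keeps the input and output shapes identical, so it can be stacked depth-wise like a ResNet stage, which is consistent with the abstract's framing of CAReNet as a compact ResNet50 competitor.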
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7399