Keywords: CNN, Feature Sparse Images, Contrastive Learning, Adversarial Training, Dynamic Margins, Attention Mechanisms
Abstract: CNNs typically perform well when they can extract progressively higher-level features through successive layers. In small, low-resolution images, however, the depth of useful feature extraction is limited, producing a sparse feature space. Icon detection is therefore particularly challenging: the target images are small and feature-sparse, offering few discriminative features. To address this, we propose an icon detection model based on a Siamese network architecture, drawing inspiration from face recognition frameworks, with modifications aimed at distinguishing subtle differences between icon pairs. Given the relatively sparse feature space of icons compared to larger images, we explore several enhancements to improve performance: attention mechanisms to focus on informative features, multi-scale feature extraction for finer detail capture, contrastive learning, and adversarial training. We also investigate dynamic margins in metric learning to model icon similarities, and employ self-supervised pretraining and Neural Architecture Search to further refine and optimize the network. Our comprehensive evaluation demonstrates significant improvements in icon detection, highlighting the effectiveness of these techniques for small, feature-sparse image data. This solution offers a valuable advancement in high-precision icon recognition, with potential applications in user interface design, software development, and digital asset management.
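To make the dynamic-margin idea concrete, the following is a minimal NumPy sketch of a contrastive loss whose margin varies per pair. It is illustrative only, not the paper's implementation: the `sim_prior` input (a prior similarity score between the two icon classes) and the linear margin-shrinking rule are assumptions introduced for this example.

```python
import numpy as np

def contrastive_loss(emb_a, emb_b, labels, sim_prior,
                     base_margin=1.0, alpha=0.5):
    """Contrastive loss with a per-pair dynamic margin (illustrative).

    emb_a, emb_b : (N, D) embeddings from the two Siamese branches.
    labels       : (N,) 1 for matching icon pairs, 0 for non-matching.
    sim_prior    : (N,) hypothetical prior similarity in [0, 1] between
                   the two icons' classes.

    The margin shrinks for pairs whose classes are known to look alike,
    so the network is not forced to push near-identical icons as far
    apart as clearly distinct ones.
    """
    dist = np.linalg.norm(emb_a - emb_b, axis=1)
    margin = base_margin * (1.0 - alpha * sim_prior)  # dynamic margin
    pos = labels * dist ** 2                          # pull matches together
    neg = (1 - labels) * np.maximum(0.0, margin - dist) ** 2  # push non-matches apart
    return float(np.mean(pos + neg))
```

With `alpha = 0`, the margin is constant and this reduces to the standard contrastive loss of Hadsell et al.; the `sim_prior` term is one simple way to let similarity structure modulate the margin.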
Primary Area: applications to computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 4549