OrthCaps: An Orthogonal CapsNet with Sparse Attention Routing and Pruning

24 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: Deep Learning, Capsule Network, Orthogonality, Pruning
TL;DR: We propose the Orthogonal Capsule Network (OrthCaps) to reduce redundancy, improve routing performance and decrease parameter count.
Abstract: Redundancy is a persistent challenge in Capsule Networks (CapsNet), leading to high computational costs and parameter counts (Jeong et al., 2019; Sharifi et al., 2021; Renzulli & Grangetto, 2022). Although previous works have introduced pruning after the initial capsule layer, the iterative and fully connected nature of dynamic routing reintroduces inefficiency and redundancy in deeper layers. In this paper, we propose the Orthogonal Capsule Network (OrthCaps) to reduce redundancy, improve routing performance, and decrease parameter count. Specifically, an efficient pruned capsule layer is introduced to discard redundant capsules, and dynamic routing is replaced with orthogonal sparse attention routing. In addition, we orthogonalize the weight matrices during routing to ensure feature diversity and maintain low capsule similarity, an idea inspired by the use of orthogonality in Convolutional Neural Networks (CNNs). Moreover, a novel activation function named Capsule ReLU is proposed to address vanishing gradients. Our experiments on benchmark datasets affirm the efficiency and robustness of OrthCaps in classification tasks, and ablation studies validate the importance of each component. Remarkably, with only 110k parameters, merely 1.25% of a standard Capsule Network's total, OrthCaps-Shallow outperforms state-of-the-art (SOTA) benchmarks on four datasets, while OrthCaps-Deep attains nearly SOTA accuracy on four datasets with only 1.2% of a standard CapsNet's parameters. The code is available at https://github.com/ornamentt/Orthogonal-Capsnet
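The abstract does not spell out how the routing weight matrices are orthogonalized. Below is a minimal PyTorch sketch of two standard ways such a constraint is commonly imposed: a hard parametrization and a soft regularization penalty. The module names, dimensions, and choice of mechanism are illustrative assumptions, not the paper's actual implementation (see the linked repository for that).

```python
import torch
import torch.nn as nn

# Hypothetical illustration: two common ways to keep a routing weight
# matrix (approximately) orthogonal. Neither is claimed to be the exact
# mechanism used in OrthCaps.

# (1) Hard constraint via PyTorch's built-in orthogonal parametrization.
routing_proj = nn.Linear(16, 16, bias=False)  # capsule-to-capsule pose projection (sizes assumed)
routing_proj = nn.utils.parametrizations.orthogonal(routing_proj, name="weight")

# (2) Soft constraint: add a penalty ||W^T W - I||_F^2 to the training loss.
def soft_orthogonality_penalty(weight: torch.Tensor) -> torch.Tensor:
    gram = weight.T @ weight
    identity = torch.eye(gram.shape[0], device=weight.device)
    return ((gram - identity) ** 2).sum()

x = torch.randn(8, 16)            # batch of 8 pose vectors
y = routing_proj(x)               # an orthogonal transform preserves vector norms
penalty = soft_orthogonality_penalty(routing_proj.weight)
```

Either approach keeps the learned projections close to orthogonal, which is one way to encourage the feature diversity and low capsule similarity the abstract describes.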
Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 8808