Resource Efficient Test-Time Training with Slimmable Network

22 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Test-Time Training, Resource Efficient, Slimmable Neural Network
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Test-Time Training (TTT), an innovative paradigm for enhancing a model's generalization in a specific future scenario, commonly leverages self-supervised learning to adapt the model to unlabeled test data under distribution shifts. However, previous TTT methods tend to disregard resource constraints during the deployment phase in real-world scenarios and suffer from two fundamental shortcomings. First, they must retrain adapted models when deploying across multiple devices with diverse resource limitations, causing considerable resource inefficiency. Second, they cannot cope with variations in the computational budget during the testing stage. To tackle these issues, we propose a resource-adaptive test-time training framework called SlimTTT, which allows seamless switching among sub-networks of different widths for adaptive inference. Furthermore, we discover that sub-networks of different widths capture different views of the input images, and that these views are complementary to those created by the data augmentation widely used in TTT. To exploit these views, we introduce Width-enhanced Contrastive Learning (WCL), Logits Consistency Regularization (LCR), and Global Feature Alignment (GFA) to promote representation consistency in both the feature and prediction spaces in a self-supervised manner, enabling networks of different widths to excel at TTT tasks. SlimTTT achieves state-of-the-art (SOTA) results across a variety of adaptation methods and four different datasets with varying backbones. Remarkably, despite reducing computational complexity by over 70% compared to the current SOTA method, SlimTTT continues to deliver competitive performance, making it highly practical for real-world deployment.
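For intuition, here is a minimal PyTorch sketch of what such a width-switching test-time objective could look like. It is not the authors' implementation: the `TinySlimmableNet` class, the `slimttt_step` function, and the simplified forms of WCL (InfoNCE across two widths), LCR (KL divergence against the full-width teacher), and GFA (matching the batch mean feature to a stored source mean `src_mean`) are all hypothetical stand-ins for the components named in the abstract.

```python
# Minimal sketch of a SlimTTT-style test-time objective, NOT the authors' code.
# All names below (TinySlimmableNet, slimttt_step, src_mean) are hypothetical.
import torch
import torch.nn.functional as F

class TinySlimmableNet(torch.nn.Module):
    """Toy slimmable network: set_width selects the fraction of hidden units used."""
    def __init__(self, in_dim=32, hidden=64, num_classes=10):
        super().__init__()
        self.fc1 = torch.nn.Linear(in_dim, hidden)
        self.fc2 = torch.nn.Linear(hidden, num_classes)
        self.width = 1.0

    def set_width(self, width):
        self.width = width

    def forward(self, x):
        k = int(self.fc1.out_features * self.width)   # number of active hidden units
        feat = F.relu(F.linear(x, self.fc1.weight[:k], self.fc1.bias[:k]))
        logits = F.linear(feat, self.fc2.weight[:, :k], self.fc2.bias)
        return feat, logits

def slimttt_step(model, x, widths=(1.0, 0.5, 0.25), src_mean=None, tau=0.1):
    """One unlabeled test-time step combining simplified WCL, LCR and GFA terms."""
    feats, logits = [], []
    for w in widths:                                   # each width yields one "view"
        model.set_width(w)
        f, z = model(x)
        # Zero-pad narrow features to full width so all views share one embedding
        # space (an illustrative choice; a learned projector would also work).
        f = F.pad(f, (0, model.fc1.out_features - f.shape[1]))
        feats.append(F.normalize(f, dim=1))
        logits.append(z)

    # WCL (sketch): InfoNCE pulling together the same sample across two widths.
    sim = feats[0] @ feats[1].t() / tau                # (B, B) similarity matrix
    wcl = F.cross_entropy(sim, torch.arange(x.shape[0]))

    # LCR (sketch): narrow widths' predictions match the full-width teacher.
    teacher = F.softmax(logits[0].detach(), dim=1)
    lcr = sum(F.kl_div(F.log_softmax(z, dim=1), teacher, reduction="batchmean")
              for z in logits[1:])

    # GFA (sketch): align the test batch's mean feature with source statistics.
    gfa = 0.0 if src_mean is None else (feats[0].mean(0) - src_mean).pow(2).sum()

    return wcl + lcr + gfa

# Usage: adapt on an unlabeled test batch, then infer at whichever width fits
# the current compute budget, with no retraining per device.
model = TinySlimmableNet()
opt = torch.optim.SGD(model.parameters(), lr=1e-3)
loss = slimttt_step(model, torch.randn(8, 32))
opt.zero_grad(); loss.backward(); opt.step()
model.set_width(0.25)                                  # cheap inference after adaptation
```

Under these assumptions, a single adapted set of weights serves every device: switching widths at inference changes only which slice of each layer is used, which is what lets the framework absorb budget variations without retraining.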
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 4376