Learning to (Learn at Test Time)

22 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: learning to learn, test-time training
Abstract: For each unlabeled test instance, test-time training (TTT) performs self-supervised learning on that single instance before making a prediction. We parameterize the self-supervised task and optimize it on the training set so that TTT improves the final prediction. This form of learning to learn works on standard benchmarks such as ImageNet. In the simplest case, TTT with only linear components can implement linear attention, and can therefore be dropped into linear transformers as a TTT layer. To evaluate the prescriptive power of our framework, we replace the linear model in each TTT layer with a neural network, using heuristics such as stochastic gradient descent and layer normalization. This yields significant improvements in performance compared to linear transformers, i.e. TTT with linear models.
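The sketch below illustrates the core idea of a TTT layer as described in the abstract: an inner linear model is updated by self-supervised gradient steps on each incoming token, and the updated model produces that token's output. The specific self-supervised task (reconstructing a token from a random linear "view"), the projection `P_view`, the single SGD step per token, and the learning rate are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

def ttt_layer(tokens, lr=0.1):
    """Minimal sketch of a TTT layer (hypothetical simplification).

    For each token x, an inner linear model W takes one gradient step on a
    self-supervised reconstruction loss 0.5 * ||W @ x_view - x||^2, where
    x_view is a fixed random linear view of x, and then emits W @ x.
    """
    d = tokens.shape[1]
    rng = np.random.default_rng(0)
    P_view = rng.normal(scale=d ** -0.5, size=(d, d))   # fixed projection defining the "view" (assumption)
    W = np.zeros((d, d))                                 # inner model, updated at test time
    outputs = []
    for x in tokens:                                     # process the sequence token by token
        x_view = P_view @ x
        grad = np.outer(W @ x_view - x, x_view)          # gradient of 0.5 * ||W x_view - x||^2 w.r.t. W
        W = W - lr * grad                                # one inner-loop SGD step on this instance
        outputs.append(W @ x)                            # predict with the updated inner model
    return np.stack(outputs)

# usage: a sequence of 16 tokens with dimension 8
out = ttt_layer(np.random.default_rng(1).normal(size=(16, 8)))
print(out.shape)  # (16, 8)
```

With only linear components and an appropriate update rule, this per-token update-then-predict pattern is what lets a TTT layer play the role of linear attention inside a linear transformer; the neural-network variant in the paper swaps the inner linear model W for a small network trained with SGD and layer normalization.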
Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 4334