Keywords: Linear Inverse Problems, Transformers, In-context learning, Meta-learning, Bayesian Inference
Abstract: In-context learning is one of the surprising and useful features of large language models, and how it works is an active area of research. Recently, stylized meta-learning-like setups have been devised that train these models on sequences of input-output pairs $(x, f(x))$ from a function class using the language-modeling loss and observe generalization to unseen functions from the same class. One of the main discoveries in this line of research is that for several problems, such as linear regression, trained transformers (TFs) learn algorithms for learning functions in context. We extend this setup to different types of linear inverse problems and show that TFs are able to in-context learn these problems as well. Additionally, we show that TFs can recover the solutions from fewer measurements than the number of unknowns by leveraging the structure of these problems, in accordance with the known recovery bounds. Finally, we discuss the multi-task setup, where the TF is pre-trained on multiple types of linear inverse problems at once, and show that at inference time, given the measurements, it is able to identify the correct problem structure and solve the inverse problem efficiently.
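The abstract describes training prompts built from input-output pairs of a randomly drawn structured linear function. As an illustration only, the sketch below shows one plausible way such prompts might be generated for a sparse linear inverse problem with fewer measurements than unknowns; the function names, dimensions, and sparsity level are assumptions for exposition, not the paper's actual data pipeline.

```python
import numpy as np

# Hypothetical sketch of the stylized setup described in the abstract:
# each prompt is a sequence of (x, f(x)) pairs from one random sparse
# linear function, so a transformer trained with the language-modeling
# loss must infer the underlying weights in context.

def sample_sparse_linear_task(dim=20, sparsity=3, rng=None):
    """Draw one task: a weight vector w with `sparsity` nonzero entries."""
    rng = rng or np.random.default_rng()
    w = np.zeros(dim)
    support = rng.choice(dim, size=sparsity, replace=False)
    w[support] = rng.standard_normal(sparsity)
    return w

def sample_prompt(w, n_points=10, rng=None):
    """Build an in-context prompt of (x_i, <w, x_i>) pairs for one task."""
    rng = rng or np.random.default_rng()
    X = rng.standard_normal((n_points, w.shape[0]))  # measurement vectors x_i
    y = X @ w                                         # observations f(x_i) = <w, x_i>
    return X, y

# Example: fewer measurements (10) than unknowns (20), as in the abstract.
rng = np.random.default_rng(0)
w = sample_sparse_linear_task(dim=20, sparsity=3, rng=rng)
X, y = sample_prompt(w, n_points=10, rng=rng)
print(X.shape, y.shape)  # (10, 20) (10,)
```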
Submission Number: 35