Abstract: Across the broad landscape of experimental design, regression has been a powerful tool for accurately predicting the outcome metrics of a system or model given a set of parameters, but it has traditionally been restricted to methods that apply only to a specific task. In this paper, we propose OmniPred, a framework for training language models as universal end-to-end regressors over $(x,y)$ evaluation data from diverse real-world experiments. Using data sourced from a large proprietary blackbox optimization database, our extensive experiments demonstrate that, using only textual representations of mathematical parameters and values, language models are capable of very precise numerical regression and, when given the opportunity to train over multiple tasks, can significantly outperform traditional regression models.
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: Updated text (noted in blue) following the initial reviews from Reviewers PTPJ, WB8P, Wdzp, and pUMR and the Action Editor's comment.
Assigned Action Editor: ~John_Timothy_Halloran1
Submission Number: 3105