AnyPredict: A Universal Tabular Prediction System Based on Large Language Models

Sunstella 2023 Summer Research Camp Submission17 Authors

15 Jun 2023 (modified: 22 Jun 2023), Sunstella 2023 Summer Research Camp Submission
Keywords: Tabular Prediction, Large Language Models, Machine Learning
Abstract: This proposal presents AnyPredict, a universal tabular prediction system based on Large Language Models (LLMs). Tabular data analysis is fundamental to both research and industry, yet no universal pipeline exists for interpreting and modeling data across different tables. AnyPredict aims to address this gap by converting tabular data into text prompts that an LLM can process and then fine-tuning the LLM to make predictions from those prompts. The proposed methodology has two parts: an input processor module that converts tabular data into prompts, and the fine-tuning of GPT-2, a widely used LLM, on the prediction task. Together, these components are intended to demonstrate the potential of LLMs for universal tabular prediction.
Submission Number: 17
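
The input processor described in the abstract turns table rows into prompts. The proposal does not specify a prompt template, so the following is only a minimal sketch of what such a conversion might look like. The function name `row_to_prompt`, the "column is value" template, and the example columns are all hypothetical and are not taken from the submission.

```python
from typing import Mapping, Optional


def row_to_prompt(row: Mapping[str, object], target: str,
                  task: Optional[str] = None) -> str:
    """Serialize one table row into a text prompt.

    Hypothetical helper: the "column is value" template is an
    illustrative assumption, not the template used by AnyPredict.
    """
    parts = []
    for column, value in row.items():
        if column == target:
            continue  # keep the label out of the prompt
        if value is None:
            continue  # skip missing cells instead of printing "None"
        parts.append(f"{column} is {value}")
    description = "; ".join(parts)
    # Fall back to a generic question when no task description is given.
    question = task or f"What is {target}?"
    return f"{description}. {question} Answer:"


# Made-up example row; the column names are for illustration only.
example_row = {"age": 63, "sex": "female", "blood pressure": 145, "outcome": "yes"}
prompt = row_to_prompt(example_row, target="outcome")
print(prompt)
```

Because the column names travel with the values, the same function applies to tables with different columns. That is one plausible reading of how a single fine-tuned language model such as GPT-2 could be shared across tables.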