ProgramTab: Boosting Table Reasoning of LLMs via Programmatic Paradigm

ACL ARR 2025 February Submission2339 Authors

14 Feb 2025 (modified: 09 May 2025) · ACL ARR 2025 February Submission · CC BY 4.0
Abstract: Table-based reasoning with large language models (LLMs), which requires reasoning over natural language questions and structured tabular data, has attracted widespread attention. However, several issues still constrain this task. Previous approaches suffer significant performance degradation on large tables due to the difficulty of long-text modeling and the input-length limits of LLMs. Text-to-SQL approaches can efficiently extract key information from tables and generate smaller sub-tables; however, tabular data, especially web tables, often lack the structure and consistency needed to perform mathematical and logical operations via SQL queries. We propose the ProgramTab framework, which uses in-context learning to guide LLMs to preprocess tabular data with Python code and to extract the key content via row and column extraction and SQL generation. Data preprocessing includes defining the data format and types according to the question. Experimental results on the WikiTQ and TabFact datasets demonstrate that ProgramTab handles table-based reasoning tasks effectively and outperforms all LLM-based baselines.
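The abstract's pipeline (Python-based preprocessing of a messy web table, then SQL over the cleaned sub-table) can be illustrated with a minimal sketch. All names, the toy table, and the normalization rule below are hypothetical; the actual ProgramTab framework generates such code via an LLM rather than hard-coding it:

```python
import re
import sqlite3

def preprocess_cell(value: str):
    """Hypothetical preprocessing step: web-table cells often mix text and
    numbers (e.g. "30,720 people"), so cast to a numeric type when possible
    so that SQL comparisons and aggregations behave correctly."""
    cleaned = value.replace(",", "").strip()
    m = re.match(r"^-?\d+(\.\d+)?", cleaned)
    if m:
        num = m.group(0)
        return float(num) if "." in num else int(num)
    return value

# Toy web table with inconsistent numeric formatting.
rows = [["Springfield", "30,720 people"], ["Shelbyville", "25,001"]]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (city TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [[preprocess_cell(c) for c in row] for row in rows],
)

# For a question like "Which city has the larger population?", the
# framework would have the LLM emit an SQL query over the preprocessed
# table, e.g.:
answer = conn.execute(
    "SELECT city FROM t ORDER BY population DESC LIMIT 1"
).fetchone()[0]
print(answer)  # Springfield
```

Without the preprocessing step, "30,720 people" would be stored as text and the `ORDER BY` comparison could silently fall back to string ordering, which is the failure mode the abstract attributes to applying SQL directly to unstructured web tables.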
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: Large Language Models, Table Reasoning
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 2339