Keywords: Open-world inference, Few-shot learning, Cross-table prediction
Abstract: Building general-purpose models that can leverage information across diverse datasets remains challenging due to varying schemas, inconsistent semantics, and arbitrary feature orderings in real-world structured data. We introduce ASPIRE (\textbf{A}rbitrary \textbf{S}et-based \textbf{P}ermutation-\textbf{I}nvariant \textbf{R}easoning \textbf{E}ngine), a universal neural inference model that performs semantic reasoning and prediction over heterogeneous tabular data. ASPIRE combines two key innovations: (1) a permutation-invariant, set-based Transformer architecture that treats feature-value pairs as unordered sets, and (2) a semantic grounding module that leverages natural language descriptions, dataset metadata, and in-context examples to align features across different datasets. This design enables ASPIRE to process arbitrary collections of feature-value pairs from any dataset and make predictions for any specified target without requiring fixed schemas or feature orderings. Once trained on diverse datasets, ASPIRE generalizes to new inference tasks without additional tuning. Our experiments demonstrate substantial improvements: 24\% higher average F1 scores in few-shot classification and 71\% lower RMSE in regression tasks compared to existing tabular foundation models. Additionally, ASPIRE naturally supports cost-aware active feature acquisition, strategically selecting informative features under budget constraints for previously unseen datasets. These capabilities position ASPIRE as a significant step toward truly universal, semantics-aware inference over structured data, enabling models to leverage patterns across the vast universe of tabular datasets rather than being limited to isolated, schema-specific learning.
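The abstract's first key idea, a permutation-invariant encoder that treats feature-value pairs as an unordered set, can be illustrated with a minimal DeepSets-style sketch. This is not the authors' architecture; the embedding table, the `tanh` combination, and mean pooling are all hypothetical stand-ins chosen only to show why symmetric pooling makes the representation independent of feature ordering.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-feature semantic embeddings (in ASPIRE these would come
# from a semantic grounding module; here they are just fixed random vectors).
FEAT_EMB = {name: rng.normal(size=8) for name in ["age", "income", "city"]}

def encode_pair(name, value):
    # Combine a feature's embedding with its scalar value (toy choice).
    return np.tanh(FEAT_EMB[name] * value)

def set_encode(pairs):
    # Symmetric (mean) pooling over the unordered feature-value pairs:
    # any permutation of the input yields the same representation.
    return np.mean([encode_pair(n, v) for n, v in pairs], axis=0)

a = set_encode([("age", 0.3), ("income", 1.2), ("city", -0.5)])
b = set_encode([("city", -0.5), ("age", 0.3), ("income", 1.2)])
assert np.allclose(a, b)  # permutation invariance holds
```

A set-based Transformer achieves the same invariance by replacing the mean with self-attention (itself a symmetric operation when positional encodings are omitted).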
Supplementary Material: zip
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 9958