HGR-TabE: Universal Tabular Embeddings via Maximal Correlation Alignment

Published: 25 May 2026, Last Modified: 29 May 2026, FMSD @ ICML 2026 Poster, CC BY 4.0
Keywords: HGR Maximal Correlations, Hypergraph Transformers, Unified Tabular Embeddings
TL;DR: We propose HGR-TabE, a universal tabular embedding model that aligns heterogeneous cell types via HGR maximal correlation and uses hypergraph transformers to produce permutation-invariant representations that generalize across multiple tasks.
Abstract: Universal text embedding models show that a single pretrained model can produce representations useful across tasks such as classification, clustering, and retrieval. In contrast, tabular foundation models remain largely task-specific. We ask whether a single tabular embedding model can generalize across tasks. We propose HGR-TabE, an initial approach that first aligns heterogeneous table-cell representations into a shared space using Hirschfeld–Gebelein–Rényi (HGR) maximal correlation, capturing relationships between numerical and non-numerical values within rows. We then apply message passing via a hypergraph transformer (All-Set Transformer modules) to preserve row and column permutation invariance. The model is trained entirely with self-supervision to learn consistent representations at the cell, row, column, and table levels. Without task-specific fine-tuning, it generates embeddings that perform well on row similarity, column similarity, and predictive tasks, demonstrating strong cross-task generalization compared to specialized models.
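For readers unfamiliar with HGR maximal correlation, the quantity can be illustrated in the simplest setting of two discrete variables. There, a standard result says it equals the second-largest singular value of the joint probability matrix after dividing each entry by the square root of the product of its marginals. The sketch below assumes exactly that textbook case; the function name `hgr_discrete` is hypothetical, and this is not the estimator or training objective used by HGR-TabE, which the abstract does not specify.

```python
import numpy as np


def hgr_discrete(joint):
    """HGR maximal correlation of two discrete variables.

    `joint` is a 2-D array of non-negative weights (counts or
    probabilities) for the pairs (x, y). Hypothetical helper for
    illustration only; assumes a fully known discrete joint distribution.
    """
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()          # normalize to a joint pmf

    px = joint.sum(axis=1)               # marginal of X (rows)
    py = joint.sum(axis=0)               # marginal of Y (columns)

    # Drop values that never occur, to avoid division by zero.
    keep_x, keep_y = px > 0, py > 0
    joint = joint[np.ix_(keep_x, keep_y)]
    px, py = px[keep_x], py[keep_y]

    # B[i, j] = P(x_i, y_j) / sqrt(P(x_i) * P(y_j))
    b = joint / np.sqrt(np.outer(px, py))

    # The largest singular value is always 1 (constant functions);
    # the second-largest one is the HGR maximal correlation.
    s = np.linalg.svd(b, compute_uv=False)
    return float(s[1]) if s.size > 1 else 0.0


if __name__ == "__main__":
    independent = np.outer([0.3, 0.7], [0.5, 0.2, 0.3])
    noisy_copy = np.array([[0.4, 0.1], [0.1, 0.4]])
    print(hgr_discrete(independent))
    print(hgr_discrete(noisy_copy))
```

Independent variables give a value of (numerically) zero and a deterministic one-to-one relationship gives one, which is what makes the measure attractive for relating values of different types, such as a categorical cell and a binned numerical cell.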
Submission Number: 52