Imbalances in Neurosymbolic Learning: Characterization and Mitigating Strategies

Published: 18 Sept 2025 · Last Modified: 29 Oct 2025 · NeurIPS 2025 poster · CC BY 4.0
Keywords: neurosymbolic learning, learning theory, learning imbalances
Abstract: We study one of the most popular problems in **neurosymbolic learning** (NSL): learning neural classifiers given only the result of applying a symbolic component $\sigma$ to the gold labels of the elements of a vector $\mathbf x$; the gold labels themselves are unknown to the learner. We make multiple contributions, theoretical and practical, to a problem that has not been studied in this context so far: characterizing and mitigating *learning imbalances*, i.e., major differences in the errors that occur when classifying instances of different classes (aka **class-specific risks**). Our theoretical analysis reveals a unique phenomenon: $\sigma$ itself can greatly impact learning imbalances. This result sharply contrasts with previous research on supervised and weakly supervised learning, which studies learning imbalances only under data imbalances. On the practical side, we introduce a technique for estimating the marginal of the hidden gold labels from weakly supervised data. We then introduce algorithms that mitigate imbalances at training and testing time by treating this marginal as a constraint. We demonstrate the effectiveness of our techniques against strong baselines from NSL and long-tailed learning, observing performance improvements of up to 14\%.
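To make the setup concrete, here is a minimal, self-contained sketch of the two practical ideas the abstract names, under illustrative assumptions not taken from the paper: $\sigma$ is assumed to be the sum of two hidden digit labels (an MNIST-addition-style instance), the marginal estimator is a simple EM-style deconvolution rather than the authors' exact procedure, and `prior_adjust` is a hypothetical helper implementing a prior-adjustment stand-in in the spirit of long-tailed learning baselines.

```python
import numpy as np

rng = np.random.default_rng(0)
NUM_CLASSES = 10

# Toy weak-supervision setup (assumed): two hidden digit labels per
# instance; the learner only observes s = sigma(y1, y2) = y1 + y2.
true_marginal = rng.dirichlet(np.ones(NUM_CLASSES))   # hidden, imbalanced p(y)
y = rng.choice(NUM_CLASSES, size=(50_000, 2), p=true_marginal)
s = y.sum(axis=1)                                     # the only labels we see

# Empirical distribution of the weak labels s in {0, ..., 18}.
emp_s = np.bincount(s, minlength=2 * NUM_CLASSES - 1) / len(s)

# For sigma = sum, the distribution of s is the self-convolution of p(y),
# so p(y) can be recovered by deconvolution. Below: an EM-style fixed
# point (illustrative estimator, not the paper's exact technique).
p_hat = np.full(NUM_CLASSES, 1.0 / NUM_CLASSES)
for _ in range(500):
    q = np.convolve(p_hat, p_hat)                     # implied dist. of s
    r = emp_s / np.maximum(q, 1e-12)
    p_hat *= np.correlate(r, p_hat, mode="valid")     # expected label counts
    p_hat /= p_hat.sum()

print("max marginal estimation error:", np.abs(p_hat - true_marginal).max())

def prior_adjust(probs, prior_model, prior_target):
    """Test-time mitigation via the estimated marginal (hypothetical helper):
    rescale a classifier's class probabilities from the prior it implicitly
    learned to the estimated gold-label prior, then renormalize per row."""
    w = prior_target / np.maximum(prior_model, 1e-12)
    adjusted = probs * w
    return adjusted / adjusted.sum(axis=1, keepdims=True)
```

The deconvolution step works because, under the assumed $\sigma$, the generating function of $s$ is the square of that of $y$, so the marginal is identifiable from weak labels alone; the adjustment step then uses that estimated marginal only as a constraint on predictions, never requiring the hidden gold labels.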
Supplementary Material: zip
Primary Area: General machine learning (supervised, unsupervised, online, active, etc.)
Submission Number: 17018