Keywords: shortcut learning, text classification, generative classifiers
TL;DR: Contrary to popular belief, generative classifiers do not avoid shortcut solutions in text classification
Abstract: Generative text classifiers, which assign labels by modeling or approximating the joint distribution over inputs and labels, have recently regained attention due to strong low-sample performance and a growing perception that they are less prone to shortcut learning than discriminative classifiers. However, existing evidence for shortcut avoidance is often indirect, frequently conflates classifier formulation with architectural differences, and is largely drawn from non-text domains. We revisit this question for text classification using a tiered experimental design that separates controlled comparisons from model-family evaluations. In capacity-matched tabular settings, we compare discriminative MLPs against class-conditional MADE density models (${\sim}17$K vs. ${\sim}18$K parameters) and discriminative tabular transformers against autoregressive generative transformers, holding data, optimizer, and evaluation protocol fixed. In NLP settings, we evaluate discriminative, generative, and pseudo-generative model families (BERT, GPT-2) across stylized SST-2 shortcuts and CivilComments demographic shortcuts. Across all settings, generative classification is not inherently shortcut-averse: when spurious cues are highly available, pure generative classifiers obtain competitive average accuracy while suffering substantially worse worst-group accuracy. Because this pattern appears in both the capacity-matched controlled experiments and the NLP model-family experiments, it cannot be dismissed as an artifact of architecture or model size. Pseudo-generative variants often mitigate this behavior, suggesting that the interface between generative modeling and discriminative prediction is central to shortcut robustness.
Paper Type: Long (8 pages)
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 142