Keywords: self-supervised speech models, wav2vec 2.0 (XLSR/XLS-R), interpretability (probing & masking diagnostics), computational phonology, ATR vowel harmony (Assamese)
Abstract: Self-supervised speech models (S3Ms) often achieve high accuracy on phonological probing tasks; however, high probing accuracy does not necessarily reflect the acquisition of grammatical processes. In this paper, we present an interpretive analysis of wav2vec 2.0's representations using Assamese Advanced Tongue Root (ATR) vowel harmony as a case study. Instead of treating probing accuracy as evidence of rule learning, we combine layerwise probing with a set of masking-based tests designed to distinguish between global feature agreement and structured phonological computation. Comparing two multilingual wav2vec 2.0 variants, we show that ATR features are linearly decodable in the intermediate layers (peaking at $\sim$80\% accuracy) and that the models encode within-word feature agreement and are sensitive to an opaque vowel (showing strong blocking effects, with a $\sim$30\% accuracy drop). At the same time, these representations provide limited evidence for rule-governed properties, such as directionality and trigger specificity. This raises an important interpretability question: do models pass individual phonological tests without implementing a systematic generative process? Our observations show that high probing accuracy and task-specific masking tests can sometimes overstate grammatical competence. We argue that phonological processes provide a valuable benchmark for interpretability methods, highlighting the importance of evaluating constraint interactions rather than isolated properties when analyzing neural speech models.
Paper Type: Long
Research Area: Interpretability and Analysis of Models for NLP
Research Area Keywords: probing, knowledge tracing/discovering/inducing, robustness
Contribution Types: Model analysis & interpretability, Approaches to low-resource settings, Data resources, Data analysis
Languages Studied: Assamese
Submission Number: 10734