Keywords: representational similarity analysis, diagnostic classifier, phonology, phonemes, neural network, interpretability
TL;DR: We study representations of phonology in neural network models of spoken language using several variants of analytical techniques.
Abstract: Despite the fast development of analysis techniques for NLP and speech
processing systems, few systematic studies have been conducted to
compare the strengths and weaknesses of each method. As a step in
this direction we study the case of representations of phonology in
neural network models of spoken language. We use two commonly applied
analytical techniques, diagnostic classifiers and representational
similarity analysis, to quantify to what extent neural activation
patterns encode phonemes and phoneme sequences. We manipulate two
factors that can affect the outcome of analysis. First, we investigate
the role of learning by comparing neural activations extracted from
trained versus randomly initialized models. Second, we examine the
temporal scope of the activations by probing both local activations
corresponding to a few milliseconds of the speech signal, and global
activations pooled over the whole utterance. We conclude that
reporting analysis results with randomly initialized models is
crucial and that global-scope methods tend to yield more consistent
results. We therefore recommend global-scope methods as a complement
to local-scope diagnostic methods.
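
To make the global-scope setting concrete, the following is a minimal sketch of representational similarity analysis over utterance-level activations. It is an illustration under stated assumptions, not the authors' implementation: the function names (`mean_pool`, `cosine_similarity_matrix`, `rsa_score`), the choice of mean pooling and cosine similarity, and the synthetic data are all hypothetical. In a real analysis the reference similarity matrix would be derived from the phoneme sequences, and the same score would be computed for both a trained and a randomly initialized model.

```python
import numpy as np

def mean_pool(frames):
    """Global-scope representation: average frame-level activations over time."""
    return frames.mean(axis=0)

def cosine_similarity_matrix(x):
    """Pairwise cosine similarities between the rows of x."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    unit = x / np.clip(norms, 1e-12, None)
    return unit @ unit.T

def rsa_score(sim_a, sim_b):
    """Pearson correlation between the upper triangles of two similarity matrices."""
    iu = np.triu_indices_from(sim_a, k=1)
    return float(np.corrcoef(sim_a[iu], sim_b[iu])[0, 1])

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumption): 6 utterances with varying numbers of
# frames, each frame an 8-dimensional activation vector.
utterances = [rng.normal(size=(rng.integers(5, 15), 8)) for _ in range(6)]
pooled = np.stack([mean_pool(u) for u in utterances])

sim_act = cosine_similarity_matrix(pooled)

# Placeholder reference space (assumption): random vectors standing in for a
# similarity matrix computed from phoneme sequences.
sim_ref = cosine_similarity_matrix(rng.normal(size=(6, 4)))

score_self = rsa_score(sim_act, sim_act)  # sanity check against itself
score_ref = rsa_score(sim_act, sim_ref)   # score against the reference space
```

Running the same computation on activations from a randomly initialized model gives a baseline against which the trained model's score can be read.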