An Information-Theoretic Parameter-Free Bayesian Framework for Probing Labeled Dependency Trees from Attention Scores
Abstract: Understanding how neural language models handle syntax is key to revealing how they understand language. We systematically analyzed methods for extracting syntax from models, namely probing, and identified five limitations that remain widespread in prior probing practice. We propose a method that directly extracts labeled dependency trees from attention scores without training any network, while computing the mutual information (MI) in a mathematically rigorous way. Compared with previous approaches, our method uses a much simpler model, yet probes more complex dependency trees and provides far more fine-grained information for model explanation. We demonstrate our method's effectiveness by systematically comparing it against a wide range of competitive baselines, and we draw informative conclusions that shed light on its explanatory potential. Our code is included in the "software" materials of the OpenReview system to preserve anonymity, and we will make it publicly available upon publication.
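The paper itself is not included on this page, but to make the core idea of the abstract concrete, here is a minimal, hypothetical sketch of parameter-free tree extraction from attention: attention weights are read as arc scores and a maximum spanning arborescence is decoded with Chu-Liu/Edmonds, with no trained probe. The function name, the head/dependent orientation of the attention matrix, and the toy input are all assumptions for illustration; the submission's actual method additionally assigns dependency labels and computes mutual information, which this sketch does not cover.

```python
# A hedged sketch of parameter-free dependency-tree extraction from attention.
# attn[i, j] is read as the score for an arc from head i to dependent j
# (an assumption; the orientation differs across models and papers).
import numpy as np
import networkx as nx


def extract_tree(attn: np.ndarray, root: int = 0) -> list[tuple[int, int]]:
    """Decode an unlabeled dependency tree from an (n x n) attention matrix."""
    n = attn.shape[0]
    g = nx.DiGraph()
    for head in range(n):
        for dep in range(n):
            if head != dep and dep != root:  # the root takes no incoming arc
                g.add_edge(head, dep, weight=float(attn[head, dep]))
    # Chu-Liu/Edmonds: maximum-weight spanning arborescence, no parameters.
    tree = nx.maximum_spanning_arborescence(g, attr="weight")
    return sorted(tree.edges())


# Toy 4-token attention matrix (rows = heads, cols = dependents).
attn = np.array([[0.0, 0.6, 0.1, 0.3],
                 [0.2, 0.0, 0.5, 0.1],
                 [0.1, 0.3, 0.0, 0.6],
                 [0.4, 0.2, 0.3, 0.0]])
print(extract_tree(attn))  # -> [(0, 1), (1, 2), (2, 3)]
```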
Paper Type: Long
Research Area: Interpretability and Analysis of Models for NLP
Research Area Keywords: probing, feature attribution
Contribution Types: Model analysis & interpretability
Languages Studied: English
Submission Number: 6044