An Information-Theoretic Parameter-Free Bayesian Framework for Probing Labeled Dependency Trees from Attention Score
Keywords: probing, attention score, dependency syntax, mutual information
TL;DR: A method that estimates the mutual information (MI) between attention scores and syntactic dependencies and reconstructs labeled dependency trees from probabilistic distributions, with no trainable networks required.
Abstract: Understanding how neural language models comprehend syntax is key to revealing how they understand language.
We systematically analyzed methods for finding syntactic structure in models, known as _probing_, and found that limitations are still widespread in previous probing practice.
We proposed a method that estimates mutual information (MI) and extracts dependency trees from attention scores in a mathematically rigorous way, with no additional network training required.
Compared with previous approaches, it uses a much simpler model yet can probe more complex dependency trees, and it is transparent enough to support fine-grained explanation.
We tested our method on several open-source LLMs and demonstrated its effectiveness by systematically comparing it with many competitive baselines. Further analysis of the results yields several informative conclusions, shedding light on our method’s explanatory potential.
Our code is released at https://github.com/ChristLBUPT/IPBP.
Primary Area: interpretability and explainable AI
Submission Number: 17656