Competence-Based Analysis of Language Models

Anonymous

16 Feb 2024, ACL ARR 2024 February Blind Submission
Abstract: Despite the recent successes of large pretrained language models (LLMs), little is known about the representations of linguistic structure they learn during pretraining, which can lead to unexpected behavior in response to small changes in inputs or application contexts. To better understand these models and behaviors, we propose a general analysis framework that moves beyond traditional performance-based evaluation of LLMs and instead analyzes them on the basis of their internal representations. Our framework, CALM (Competence-based Analysis of Language Models), studies and measures the linguistic competence of LLMs in the context of specific tasks by intervening on models' internal representations of different linguistic properties using causal probing, and evaluating models' alignment under these interventions with a given ground-truth causal model of the task. We also develop a novel approach for performing causal probing interventions using gradient-based adversarial attacks, which can target a broader range of properties and representations than existing techniques. Finally, we carry out a case study of CALM using these interventions to analyze the competence of BERT and RoBERTa across a variety of lexical inference tasks, showing that CALM can be used to explain and predict their behavior on these tasks.
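To make the abstract's methodology concrete, the following is a minimal, hypothetical sketch of a gradient-based causal probing intervention. All names and modeling choices here are illustrative assumptions, not the paper's actual code: a linear probe is trained to detect a linguistic property in a (toy) hidden representation, and the intervention then takes gradient steps on the probe's loss with respect to the representation itself, pushing it across the probe's decision boundary, in the spirit of an adversarial attack on the probe rather than on the model's output.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_probe(H, y, lr=0.1, steps=500):
    """Fit a logistic-regression probe on hidden states H (n, d) for property labels y."""
    n, d = H.shape
    w, b = np.zeros(d), 0.0
    for _ in range(steps):
        p = sigmoid(H @ w + b)
        w -= lr * (H.T @ (p - y) / n)  # gradient of mean binary cross-entropy w.r.t. w
        b -= lr * np.mean(p - y)
    return w, b

def intervene(h, w, b, target=0.0, lr=0.5, steps=50):
    """Gradient-based intervention: move h until the probe predicts `target` for it."""
    h = h.copy()
    for _ in range(steps):
        p = sigmoid(h @ w + b)
        h -= lr * (p - target) * w  # gradient of cross-entropy w.r.t. h itself
    return h

rng = np.random.default_rng(0)
# Toy stand-in for model hidden states: the "property" is encoded along coordinate 0.
H = rng.normal(size=(200, 8))
y = (H[:, 0] > 0).astype(float)
w, b = train_probe(H, y)

h = H[np.argmax(H @ w + b)]           # a representation that clearly encodes the property
h_cf = intervene(h, w, b, target=0.0)  # counterfactual with the property "removed"
print(sigmoid(h @ w + b) > 0.5)        # probe detects the property in h
print(sigmoid(h_cf @ w + b) > 0.5)     # probe no longer detects it in h_cf
```

In the framework described above, such counterfactual representations would be fed back into the model, and the resulting change in task behavior compared against the predictions of the ground-truth causal model; this sketch only shows the intervention step on a single vector.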
Paper Type: long
Research Area: Interpretability and Analysis of Models for NLP
Contribution Types: Model analysis & interpretability, Theory
Languages Studied: English