Metric-Learning Encoding Models Identify Processing Profiles of Linguistic Features in BERT’s Representations

Anonymous

16 Feb 2024 · ACL ARR 2024 February Blind Submission · Readers: Everyone
Abstract: We introduce Metric-Learning Encoding Models (MLEMs) as a new approach to understanding neural representations of sentences and their linguistic features (e.g., tense, subject person, object number). MLEMs are capable of detecting both local and distributed representations. As a proof of concept, we apply MLEMs to neural representations extracted from BERT and find that: (1) there exists an order among linguistic features, which separate representations of sentences to different degrees in different layers; (2) in some layers, neural representations are organized in a hierarchical way, with clusters nested within larger clusters, separated by linguistic features at different scales; (3) in some layers (most strikingly, the middle layer 5 of BERT), linguistic features are strongly disentangled, that is, represented within distinct clusters of selective units; (4) MLEMs are more robust to type-I errors than multivariate decoding methods and are superior to univariate encoding methods in predicting neural activity. Together, these results demonstrate the utility of Metric-Learning Encoding Models for studying how linguistic features are neurally encoded in language models, and the advantage of MLEMs over traditional methods. MLEMs can be extended to other domains (e.g., vision) and to other neural systems, such as the human brain.
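To make the core idea concrete, below is a minimal sketch of a metric-learning encoding model, assuming the essence is to learn per-feature weights such that mismatches in linguistic features predict pairwise distances between sentence representations. The toy data, feature names, and the choice of ridge regression are illustrative assumptions, not the authors' exact setup.

```python
# Minimal MLEM sketch (assumption: learn a weighted feature metric that
# predicts distances between sentence representations in one layer).
import numpy as np
from itertools import combinations
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

# Toy stand-ins: 20 "sentences", each with a 768-d representation
# (e.g., one BERT layer) and three binary linguistic features
# (hypothetical labels: tense, subject person, object number).
n_sentences, dim = 20, 768
reps = rng.normal(size=(n_sentences, dim))
features = rng.integers(0, 2, size=(n_sentences, 3))

# Build the regression problem over sentence pairs:
#   y = distance between the two representations
#   X = per-feature mismatch indicators (1 if the feature differs, else 0)
pairs = list(combinations(range(n_sentences), 2))
X = np.array([(features[i] != features[j]).astype(float) for i, j in pairs])
y = np.array([np.linalg.norm(reps[i] - reps[j]) for i, j in pairs])

# Fit the metric: the learned coefficients indicate how strongly each
# linguistic feature separates sentence representations in this layer.
model = Ridge(alpha=1.0).fit(X, y)
print("learned metric weights per feature:", model.coef_)
print("R^2 on pairwise distances:", model.score(X, y))
```

Comparing the learned weights across layers would then yield the kind of per-layer "processing profile" of linguistic features the abstract describes.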
Paper Type: long
Research Area: Interpretability and Analysis of Models for NLP
Contribution Types: Model analysis & interpretability, Data resources
Languages Studied: English