BERTology Meets Biology: Interpreting Attention in Protein Language Models

Jesse Vig; Ali Madani; Lav R. Varshney; Caiming Xiong; Richard Socher; Nazneen Rajani

Published: 12 Jan 2021, Last Modified: 12 Oct 2025 | ICLR 2021 Poster | Readers: Everyone

Keywords: interpretability, black box, computational biology, representation learning, attention, transformers, visualization, natural language processing

Abstract: Transformer architectures have been shown to learn useful representations for protein classification and generation tasks. However, these representations present challenges in interpretability. In this work, we demonstrate a set of methods for analyzing protein Transformer models through the lens of attention. We show that attention: (1) captures the folding structure of proteins, connecting amino acids that are far apart in the underlying sequence but spatially close in the three-dimensional structure, (2) targets binding sites, a key functional component of proteins, and (3) focuses on progressively more complex biophysical properties with increasing layer depth. We find this behavior to be consistent across three Transformer architectures (BERT, ALBERT, XLNet) and two distinct protein datasets. We also present a three-dimensional visualization of the interaction between attention and protein structure. Code for visualization and analysis is available at https://github.com/salesforce/provis.
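
To make claim (1) concrete, the following is a minimal sketch of one way to measure how often high-attention pairs coincide with spatial contacts. It is an illustration only, not the code from the provis repository: the function name `attention_contact_agreement`, the attention threshold of 0.3, and the minimum sequence separation of 6 are all assumptions chosen for this example, and the toy matrices are made up.

```python
import math


def attention_contact_agreement(attn, contacts, threshold=0.3, min_separation=6):
    """Fraction of high-attention residue pairs that are also in contact.

    attn:      n x n nested list of attention weights for one head (hypothetical input).
    contacts:  n x n nested list of booleans, True where two residues are spatially close.
    threshold and min_separation are illustrative assumptions, not values from the source.
    """
    high = 0  # pairs with attention above the threshold and far apart in sequence
    hits = 0  # subset of those pairs that are also contacts
    n = len(attn)
    for i in range(n):
        for j in range(n):
            # Skip pairs that are close in the sequence; the claim concerns distant pairs.
            if abs(i - j) < min_separation:
                continue
            if attn[i][j] > threshold:
                high += 1
                if contacts[i][j]:
                    hits += 1
    # Undefined when no pair passes the filters.
    return hits / high if high else math.nan


# Toy data: an 8-residue sequence with three attended pairs.
n = 8
attn = [[0.0] * n for _ in range(n)]
attn[0][7] = 0.9  # far apart in sequence, in contact
attn[1][7] = 0.5  # far apart in sequence, not in contact
attn[2][3] = 0.8  # adjacent in sequence, filtered out

contacts = [[False] * n for _ in range(n)]
contacts[0][7] = contacts[7][0] = True

score = attention_contact_agreement(attn, contacts)
print(score)
```

In practice the attention matrix would come from one head of a protein language model and the contact map from a solved structure; a head whose score is well above the background contact rate would be a candidate for the behavior described in the abstract.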

One-sentence Summary: We analyze the internal representations of protein language models, and show that attention targets structural and functional properties of protein sequences.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

Supplementary Material: zip

Code: [salesforce/provis](https://github.com/salesforce/provis) + [1 community implementation](https://paperswithcode.com/paper/?openreview=YWtLZvLmud7)

Community Implementations: [3 code implementations](https://www.catalyzex.com/paper/bertology-meets-biology-interpreting/code)
