What do you learn from context? Probing for sentence structure in contextualized word representations

Ian Tenney; Patrick Xia; Berlin Chen; Alex Wang; Adam Poliak; R Thomas McCoy; Najoung Kim; Benjamin Van Durme; Samuel R. Bowman; Dipanjan Das; Ellie Pavlick

Published: 21 Dec 2018, Last Modified: 21 Apr 2024, ICLR 2019 Conference Blind Submission, Readers: Everyone

Abstract: Contextualized representation models such as ELMo (Peters et al., 2018a) and BERT (Devlin et al., 2018) have recently achieved state-of-the-art results on a diverse array of downstream NLP tasks. Building on recent token-level probing work, we introduce a novel edge probing task design and construct a broad suite of sub-sentence tasks derived from the traditional structured NLP pipeline. We probe word-level contextual representations from four recent models and investigate how they encode sentence structure across a range of syntactic, semantic, local, and long-range phenomena. We find that existing models trained on language modeling and translation produce strong representations for syntactic phenomena, but offer only comparatively small improvements on semantic tasks over a non-contextual baseline.

Keywords: natural language processing, word embeddings, transfer learning, interpretability

TL;DR: We probe for sentence structure in ELMo and related contextual embedding models. We find existing models efficiently encode syntax and show evidence of long-range dependencies, but only offer small improvements on semantic tasks.

Code: [2 community implementations](https://paperswithcode.com/paper/?openreview=SJzSgnRcKX)

Data: [Billion Word Benchmark](https://paperswithcode.com/dataset/billion-word-benchmark)

Community Implementations: [2 code implementations](https://www.catalyzex.com/paper/arxiv:1905.06316/code)
