Are there identifiable structural parts in the sentence embedding whole?

ACL ARR 2024 June Submission899 Authors

13 Jun 2024 (modified: 06 Aug 2024), CC BY 4.0
Abstract: Sentence embeddings from transformer models encode a great deal of linguistic information in a fixed-length vector. We explore the hypothesis that these embeddings consist of overlapping layers of information that can be separated, and within which specific types of information -- such as information about chunks and their structural and semantic properties -- can be detected. We show that this is the case using a dataset of sentences with known chunk structure, together with two linguistic intelligence datasets whose solutions require detecting chunks and, respectively, their grammatical number and their semantic roles. We support these findings through analyses of performance on the tasks and of the internal representations built during learning.
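To make the kind of experiment the abstract describes concrete, here is a minimal, hypothetical sketch of probing fixed-length sentence embeddings for a chunk-level property. The ELECTRA checkpoint, mean pooling, toy sentences, and the grammatical-number labels are illustrative assumptions for the sketch, not the authors' actual setup or data.

```python
# Hypothetical probing sketch: embed sentences with ELECTRA, then train a
# linear probe to detect a chunk-level property (here, grammatical number
# of the subject chunk). Checkpoint, pooling, and labels are assumptions.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("google/electra-base-discriminator")
model = AutoModel.from_pretrained("google/electra-base-discriminator")

def embed(sentences):
    """Mean-pool the last hidden states into one fixed-length vector per sentence."""
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state      # (batch, seq_len, dim)
    mask = batch["attention_mask"].unsqueeze(-1)       # exclude padding from the mean
    return ((hidden * mask).sum(1) / mask.sum(1)).numpy()

# Toy labels: 0 = singular subject chunk, 1 = plural subject chunk.
train_sents = ["The dog barks.", "The dogs bark.", "A child sings.", "The children sing."]
train_labels = [0, 1, 0, 1]

probe = LogisticRegression(max_iter=1000).fit(embed(train_sents), train_labels)
print(probe.predict(embed(["The cats sleep.", "The cat sleeps."])))
```

If a simple linear probe of this sort recovers the property well above chance, that is evidence the property is linearly accessible in the embedding; the paper's hypothesis goes further, asking whether such properties occupy separable layers of the representation.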
Paper Type: Long
Research Area: Interpretability and Analysis of Models for NLP
Research Area Keywords: sentence embeddings, pretrained transformer models, electra
Contribution Types: Model analysis & interpretability
Languages Studied: English, French
Submission Number: 899