Towards Understanding Large-Scale Discourse Structures in Pre-Trained and Fine-Tuned Language Models

Anonymous

08 Mar 2022 (modified: 05 May 2023) · NAACL 2022 Conference Blind Submission · Readers: Everyone
Paper Link: https://openreview.net/forum?id=eiTCK6FDs_N
Paper Type: Long paper (up to eight pages of content + unlimited references and appendices)
Abstract: In this paper, we extend the line of BERTology work by focusing on the important, yet less explored, alignment of pre-trained and fine-tuned PLMs with large-scale discourse structures. We propose a novel approach to infer discourse information for arbitrarily long documents. In our experiments, we find that the captured discourse information is local and general, even across a collection of fine-tuning tasks. We compare the inferred discourse trees with supervised, distantly supervised, and simple baselines to explore the structural overlap, finding that constituency discourse trees align well with supervised models, yet contain complementary discourse information. Lastly, we explore self-attention matrices individually to analyze information redundancy, finding that similar discourse information is consistently captured in the same heads.
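The raw signal analyzed in this line of work is the per-head self-attention matrix of a PLM. As a minimal sketch (not the authors' exact tree-inference procedure), the snippet below extracts these matrices from a Hugging Face checkpoint; the model name and input text are placeholder assumptions.

```python
# Sketch: extract per-layer, per-head self-attention matrices from a PLM.
# Assumes the Hugging Face `transformers` library; "bert-base-uncased" is a
# placeholder checkpoint, not necessarily the one used in the paper.
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, output_attentions=True)
model.eval()

text = "Sentence one of the document. Sentence two follows it."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions is a tuple with one tensor per layer,
# each of shape (batch, num_heads, seq_len, seq_len).
attentions = torch.stack(outputs.attentions)  # (layers, batch, heads, seq, seq)
print(attentions.shape)
```

From matrices like these, head-level discourse analyses typically aggregate token-to-token attention into unit-to-unit scores before inferring tree structures; the aggregation and tree-building steps are specific to the paper and not reproduced here.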
Presentation Mode: This paper will be presented in person in Seattle
Copyright Consent Signature (type Name Or NA If Not Transferrable): Patrick Huber
Copyright Consent Name And Address: UBC, 2329 West Mall, Vancouver, BC, Canada, V6T 1Z4