Topic-aware Contextualized Transformers

28 Sept 2020 (modified: 05 May 2023) · ICLR 2021 Conference Blind Submission
Abstract: Trained on disjoint fixed-length segments, Transformers successfully transform static word embeddings into contextualized word representations. However, they often restrict the context of a token to the segment it resides in, neglecting the flow of contextual information across segments and failing to capture longer-term dependencies beyond the predefined segment length. This paper uses a probabilistic deep topic model to provide contextualized embeddings at both the token and segment levels. It also introduces topic self-attention and a topic select-attention guided by contextual next-word embeddings, injecting contextualized topic information into Transformer-based architectures. Moving beyond conventional Transformers, which ignore longer-range word dependencies and contextualize their word representations only at the segment level, the proposed method not only captures the global semantic coherence of all segments and global word co-occurrence patterns, but also enriches the representation of each token by adapting it to its local context, which is not limited to the segment the token resides in and can be flexibly defined according to the task. Experiments on various corpora show that, with only a few extra parameters, the proposed topic-aware contextualized Transformers consistently outperform their conventional counterparts and can be used to generate coherent sentences and paragraphs.
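The abstract does not spell out implementation details, but the central idea of injecting contextualized topic information into a Transformer layer through attention over topic embeddings can be sketched. The PyTorch snippet below is a minimal illustration under assumed names and dimensions (`TopicAttentionBlock`, `d_topic`, `n_topics` are hypothetical); the paper's actual topic self-attention and next-word-embedding-guided topic select-attention are richer than this simple cross-attention sketch.

```python
import torch
import torch.nn as nn

class TopicAttentionBlock(nn.Module):
    """Illustrative sketch: let each token attend over per-segment topic
    embeddings in addition to standard self-attention. Names, dimensions,
    and wiring are assumptions, not the paper's exact architecture."""

    def __init__(self, d_model=512, n_heads=8, d_topic=64):
        super().__init__()
        # Standard token self-attention within the segment.
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Cross-attention from tokens to topic embeddings ("topic attention").
        self.topic_proj = nn.Linear(d_topic, d_model)
        self.topic_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model)
        )
        self.norm3 = nn.LayerNorm(d_model)

    def forward(self, x, topic_emb):
        # x: (batch, seq_len, d_model) token hidden states for one segment
        # topic_emb: (batch, n_topics, d_topic) contextualized topic embeddings,
        # assumed to come from a separately trained probabilistic deep topic model
        h, _ = self.self_attn(x, x, x)
        x = self.norm1(x + h)
        t = self.topic_proj(topic_emb)      # project topics into model space
        h, _ = self.topic_attn(x, t, t)     # tokens attend over topic embeddings
        x = self.norm2(x + h)
        x = self.norm3(x + self.ffn(x))
        return x

# Usage sketch with random tensors standing in for real model outputs.
block = TopicAttentionBlock()
tokens = torch.randn(2, 128, 512)   # 2 segments, 128 tokens each
topics = torch.randn(2, 50, 64)     # 50 topic embeddings per segment
out = block(tokens, topics)
print(out.shape)                    # torch.Size([2, 128, 512])
```

Because the topic embeddings summarize information beyond the current segment, a layer of this form is one way a token's representation could be conditioned on context that is not limited to the segment it resides in.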
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Supplementary Material: zip
Reviewed Version (pdf): https://openreview.net/references/pdf?id=Zb8hrRkSvQ