Inducing Document Representations from Graphs: A Blueprint

Boshko Koloski; Marko Pranjic; Nada Lavrac; Blaž Škrlj; Senja Pollak

01 Mar 2023 (modified: 01 Jun 2023)
Submitted to Tiny Papers @ ICLR 2023
Readers: Everyone

Keywords: graph neural networks, document embeddings, language models

TL;DR: Graphs of documents and where to find them?

Abstract: Representing textual documents in continuous numerical spaces is a crucial task in NLP. Early NLP practitioners built their approaches around capturing statistical patterns within documents and using them as features in rich feature spaces. In contrast, contemporary state-of-the-art techniques leverage large neural networks and learn document representations in a self-supervised manner. However, while these approaches excel at learning contextual word representations, they often overlook implicit document-to-document relations that can arise in real-world settings. To address this issue, we propose a blueprint method for constructing document representations that explicitly accounts for implicit document-to-document relations.
