Abstract: Literature-Based Discovery (LBD) aims to uncover hidden associations, in the form of one or more connecting concepts, between two seemingly unrelated concepts in the literature. Recent studies approach the task by learning representations of the concepts and using representation similarity to decide whether a connecting concept is reasonable. However, these approaches cannot properly handle “complex” associations involving multiple connecting concepts. To address this issue, we propose LBDSetNet, a neural network model that assigns a “credibility” score to a candidate association with one or more connecting concepts. By representing both the literature and the candidate associations uniformly as bags of concepts, we can generate “less credible” documents and train LBDSetNet to distinguish them from the original literature, overcoming the lack of labeled associations. We further propose a new double-margin cost function that improves training by additionally generating “more credible” documents. Experiments show that our model finds “complex” associations effectively and efficiently. Comparative experiments also show that LBDSetNet significantly outperforms previously proposed models on “simple” associations, and that the double-margin cost function is advantageous in model training.
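For concreteness, one plausible reading of such a double-margin objective is a pair of hinge terms: a large margin separating originals from the easy “less credible” corruptions, and a smaller margin for the hard “more credible” ones. The sketch below is an illustrative assumption, not the paper's actual definition; the function name, margin values, and exact formulation are all hypothetical.

```python
import torch

def double_margin_loss(s_orig, s_less, s_more, m_large=1.0, m_small=0.2):
    """Hypothetical double-margin hinge loss (sketch, not the paper's formula).

    s_orig : credibility scores of original documents
    s_less : scores of generated "less credible" documents
    s_more : scores of generated "more credible" documents

    Easy corruptions must trail the originals by a large margin, while
    near-plausible documents only need a small one, so the model is not
    over-penalized for scoring them close to real literature.
    """
    easy = torch.clamp(m_large - (s_orig - s_less), min=0.0)
    hard = torch.clamp(m_small - (s_orig - s_more), min=0.0)
    return (easy + hard).mean()

# Toy example: a batch of 4 documents with one corruption of each kind.
s_orig = torch.tensor([2.0, 1.5, 1.8, 2.2])
s_less = torch.tensor([0.3, 0.9, 0.1, 0.5])
s_more = torch.tensor([1.9, 1.2, 1.7, 2.0])
print(double_margin_loss(s_orig, s_less, s_more))
```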