Abstract: Link Prediction (LP) approaches based on Language Models (LMs) operate over the labels and descriptions of entities and relations in a KG, achieving LP performance competitive with the state of the art. Recent approaches have shown that incorporating a local graph neighborhood can improve the LP capabilities of LMs. These approaches usually sample the context randomly from the neighborhood of the query triple, thereby introducing noise that can hinder the model in making correct predictions.
In this work, we derive an approximately optimal context for a given query under the assumption that we know the correct answer. This allows us to investigate the characteristics of such contexts and the impact of a good context on LP, thereby providing an approximate upper bound on the achievable performance when using optimal contexts. We provide evidence that the neighborhoods created through random sampling are often suboptimal and unnecessarily large.
Furthermore, we show that the potential improvement from using an optimal context can be significant. We conclude that research on context selection is an important step towards developing better LP models.
Paper Type: Long
Research Area: Machine Learning for NLP
Research Area Keywords: graph-based methods, knowledge-augmented methods
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Position papers
Languages Studied: English
Submission Number: 186