Anchored Semantics: Augmenting Ontologies via Competency Questions, Self-Attention, and Predictive Graph Learning
Abstract: We propose a framework that enriches ontologies by leveraging competency questions and distant supervision. The process begins with an LLM extracting domain-relevant entities from the questions, which are then incrementally refined through short definitions anchored to a predefined dictionary. These entities and their hierarchies, together with the associated queries, are embedded with a fine-tuned Llama3.2:1b model and fused by a self-attention mechanism into unified representations. A directed acyclic graph models the dependencies between entities, with additional nodes derived from frequent co-occurrences in the queries. A Graph Attention Network (GAT) then performs stable link prediction to uncover latent semantic relationships, and the predicted links are labeled with specific relation types by a fine-tuned RoBERTa module. Evaluations on datasets from HPC training sessions and OpenAlex abstracts show significant improvements in link prediction and ontology enrichment over standard GAT and GraphSAGE baselines.
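To make the fusion step concrete, the following is a minimal sketch of how per-entity embedding views (entity label, anchored definition, associated queries) could be combined by self-attention into a single unified representation, assuming PyTorch; the hidden size (2048, matching Llama3.2:1b's hidden dimension), head count, and mean pooling are illustrative assumptions, not the paper's reported configuration.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Fuse several embedding views of one entity (label, definition,
    related queries) into a unified representation via self-attention."""

    def __init__(self, dim: int = 2048, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, views: torch.Tensor) -> torch.Tensor:
        # views: (batch, num_views, dim) -- e.g. LLM embeddings of the
        # entity label, its dictionary-anchored definition, and queries.
        fused, _ = self.attn(views, views, views)
        fused = self.norm(fused + views)   # residual connection + norm
        return fused.mean(dim=1)           # pool views -> (batch, dim)

# Usage: three hypothetical embedding views per entity.
entity_views = torch.randn(4, 3, 2048)     # 4 entities, 3 views each
unified = AttentionFusion()(entity_views)  # -> (4, 2048)
```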
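Similarly, a hedged sketch of the GAT-based link predictor over the entity DAG, assuming PyTorch Geometric; the two-layer encoder, layer sizes, and dot-product decoder are common stand-ins and are not confirmed as the paper's exact architecture.

```python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GATConv

class GATLinkPredictor(torch.nn.Module):
    def __init__(self, in_dim: int, hidden: int = 256, heads: int = 4):
        super().__init__()
        self.conv1 = GATConv(in_dim, hidden, heads=heads)
        self.conv2 = GATConv(hidden * heads, hidden, heads=1)

    def encode(self, x, edge_index):
        # Two rounds of attention-weighted message passing over the DAG.
        h = F.elu(self.conv1(x, edge_index))
        return self.conv2(h, edge_index)

    def decode(self, z, edge_pairs):
        # Score candidate (src, dst) pairs by embedding dot product.
        src, dst = edge_pairs
        return (z[src] * z[dst]).sum(dim=-1)

model = GATLinkPredictor(in_dim=2048)
# x: unified node embeddings; edge_index: known DAG edges, shape (2, E);
# candidate: latent links to score, e.g. from query co-occurrence:
# logits = model.decode(model.encode(x, edge_index), candidate)
```

Links scored above a threshold would then be passed to the fine-tuned RoBERTa module for relation typing.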