Ontology-based box embeddings and knowledge graphs for predicting phenotypic traits in Saccharomyces cerevisiae
Keywords: Knowledge graph, Box embeddings, GNN, Bioinformatics, Discovery science
TL;DR: Combining box embeddings of ontologies with a knowledge graph, we predict cell growth in yeast and suggest new scientific knowledge, which is validated in a biological experiment.
Track: Knowledge Graphs, Ontologies and Neurosymbolic AI
Abstract: We present a method that uses graph neural networks (GNNs) to predict and interpret digenic deletion fitness in the yeast Saccharomyces cerevisiae from a knowledge graph (KG) with ontology-based box embeddings.
We construct the KG from community databases using terms defined in several ontologies. From the class hierarchies in the ontologies, box embeddings are learnt as low-dimensional representations of the nodes in the graph, and these are used together with GNNs to predict cell growth for digenic deletions from the KG. This shows that high-level qualitative information can be used to predict experimental data.
Prediction performance improved when box embeddings of ontologies were used to represent the nodes in the graph, compared to learning task-specific features. This suggests that class hierarchies in ontologies contain useful information about the domains, which can be extracted when training the box embeddings. We also demonstrate that our model can generalise beyond the task it was trained for by evaluating it on higher-order gene deletions.
Additionally, we apply model interpretability techniques to identify co-occurring edges critical for fitness. Our findings are further validated by a biological experiment that reveals an association between inositol utilization and osmotic stress resistance, emphasising the model’s potential to guide scientific discovery.
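As a rough illustration of the box-embedding idea mentioned in the abstract, the sketch below represents each ontology class as an axis-aligned box and treats "subclass of" as geometric containment. It is a minimal toy under stated assumptions, not the authors' implementation: the names `Box`, `contains`, and `containment_violation`, and the example coordinates, are all hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Box:
    """Axis-aligned box given by its lower and upper corners (hypothetical type)."""
    lower: Sequence[float]
    upper: Sequence[float]


def contains(outer: Box, inner: Box) -> bool:
    """True if `inner` lies entirely inside `outer` in every dimension."""
    return all(
        ol <= il and iu <= ou
        for ol, il, iu, ou in zip(outer.lower, inner.lower, inner.upper, outer.upper)
    )


def containment_violation(outer: Box, inner: Box) -> float:
    """Total amount by which `inner` sticks out of `outer`.

    Zero when the subclass relation holds geometrically. A quantity of this
    kind could serve as a training penalty (an assumption for illustration).
    """
    return sum(
        max(0.0, ol - il) + max(0.0, iu - ou)
        for ol, il, iu, ou in zip(outer.lower, inner.lower, inner.upper, outer.upper)
    )


# Toy 2-D example: `child` is embedded inside `parent`, `other` is not.
parent = Box(lower=(0.0, 0.0), upper=(4.0, 4.0))
child = Box(lower=(1.0, 1.0), upper=(3.0, 2.0))
other = Box(lower=(3.0, 3.0), upper=(5.0, 5.0))

print(contains(parent, child))               # True
print(containment_violation(parent, other))  # 2.0
```

In a learnt embedding the corner coordinates would be trainable parameters, and the resulting boxes would then serve as node features for the GNN.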
Paper Type: Long Paper
Software: https://github.com/filipkro/kg-box-emb
Submission Number: 41