LLM OntologyRAG - Extending a Food-Agent with a Description Logic Knowledge Representation

Damir Cavar; Naishal Nehal Shah

Published: 28 Apr 2026; Last Modified: 28 Apr 2026; Venue: MSLD 2026 Poster; Readers: Everyone; License: CC BY 4.0

Keywords: Large Language Model, Ontology, Neuro Symbolic Reasoning, Retrieval Augmented Generation

TL;DR: Using an ontology as a Description Logic context for RAGs, extending food agents with the Food ontology to improve LLM responses and reasoning related to food and nutrition queries

Abstract: Large Language Models (LLMs) are known to hallucinate (Zhang et al., 2023; Gao et al., 2024). Updating information or facts in a model typically requires retraining it, and the options for adding private or personal information to a model are limited. Retrieval-Augmented Generation (RAG) (Lewis et al., 2020) has been proposed to address these problems: it aims to minimize hallucinations, to provide updated facts and knowledge, and to give LLMs access to private and personal information. A RAG system provides an extended context to an LLM for a specific user query. The user query is intercepted before it is handed over to the LLM, and a similarity search based on the query identifies related text segments in a database. The matching text segments are forwarded to the LLM as context along with the query, so that the LLM can prioritize this context when generating a response. Such an architecture can reduce hallucinations and give an LLM access to updated information or knowledge. It also allows users to make their private texts available to an LLM without including them in its training corpus. In our approach, we extend the RAG architecture by adding a formal ontology, i.e., a formal definition of a specific domain in a Description Logic (DL) (Baader et al., 2007) formalism, specifically the Web Ontology Language (OWL). As a domain, we picked food and nutrition, focusing on the USDA FoodData Central databases. The USDA food data (USDA FoodData, 2026) is provided as relational databases and tables. We converted the relational database into an OWL format that reflects the hierarchical relations in the concept taxonomy (or class hierarchy). The concept hierarchy includes general food items and specific products, as well as all the nutrients associated with the food items in the USDA database. The food items, ingredients, and nutrients are arranged in a hierarchy of hypernyms.
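The conversion from relational rows to a class hierarchy can be illustrated with a minimal Python sketch. It is not the authors' conversion pipeline: the rows, the `ex:` namespace, and the function names are hypothetical placeholders, and the real USDA FoodData Central schema is considerably richer. The sketch only shows how parent references in a table can become `rdfs:subClassOf` triples, and how a hypernym chain can be read off the result.

```python
# Hypothetical rows standing in for a relational table; these are NOT
# actual USDA FoodData Central records or column names.
FOOD_ROWS = [
    {"id": "Food", "parent": None},
    {"id": "Dairy", "parent": "Food"},
    {"id": "Cheese", "parent": "Dairy"},
    {"id": "Cheddar", "parent": "Cheese"},
]

# Placeholder namespace, assumed for illustration only.
PREFIXES = {
    "ex": "http://example.org/food#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
}


def rows_to_triples(rows):
    """Turn each row into an OWL class plus a subClassOf link to its parent."""
    triples = []
    for row in rows:
        cls = "ex:" + row["id"]
        triples.append((cls, "rdf:type", "owl:Class"))
        if row["parent"] is not None:
            triples.append((cls, "rdfs:subClassOf", "ex:" + row["parent"]))
    return triples


def hypernyms(triples, cls):
    """Walk subClassOf links upward and return the chain of superclasses."""
    parent_of = {s: o for s, p, o in triples if p == "rdfs:subClassOf"}
    chain = []
    while cls in parent_of:
        cls = parent_of[cls]
        chain.append(cls)
    return chain


def to_turtle(triples):
    """Serialize the triples as Turtle text with prefix declarations."""
    header = [f"@prefix {name}: <{iri}> ." for name, iri in PREFIXES.items()]
    body = [f"{s} {p} {o} ." for s, p, o in triples]
    return "\n".join(header + [""] + body)


TRIPLES = rows_to_triples(FOOD_ROWS)

if __name__ == "__main__":
    print(to_turtle(TRIPLES))
    print(hypernyms(TRIPLES, "ex:Cheddar"))
```

The upward walk in `hypernyms` assumes single inheritance; an ontology with multiple superclasses per class would need a graph traversal instead.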
Relations between the concepts (OWL classes) are represented as formal relations with specified domains and ranges. This structure facilitates Description Logic-based reasoning and allows users to identify food items by nutrient type or ingredient group, and to see related food items. The ontology can be queried using SPARQL, and it can be used partially or completely as the context for an LLM in a RAG-type architecture. Hierarchical and graph-based arrangements of knowledge have been shown to be beneficial in RAG systems (Peng et al., 2024; Huang et al., 2025). However, using ontologies and DL-based RAG provides additional advantages. We provide results for experiments in which: (a) LLMs generate SPARQL queries from user queries to pull triple sets from the Food ontology; (b) a subgraph of the Food ontology is selected as the context for a user query in a RAG architecture; (c) LLMs consume the entire Food ontology as the context for responding to a user query. State-of-the-art LLMs can process structured RDF in Turtle format directly and generate responses from ontologies. Subgraphs and triple sets can be provided as raw triple sets in RDF format, or as sets of short sentences formulated from subject-predicate-object tuples. We show that an OntologyRAG not only provides much better results than baseline text-based RAGs, but also offers useful reasoning capabilities that provide significantly better tools, for example, for summarization using hypernyms and semantic properties.
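The subgraph selection and the sentence formulation of triples described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the triples, the predicate names (`subClassOf`, `hasNutrient`), the keyword-matching selection strategy, and the prompt layout are all hypothetical. A real system would more likely select the subgraph with a SPARQL query or an embedding-based search.

```python
import re

# Hypothetical triples for illustration; not taken from the Food ontology.
TRIPLES = [
    ("Cheddar", "subClassOf", "Cheese"),
    ("Cheese", "subClassOf", "Dairy"),
    ("Cheddar", "hasNutrient", "Calcium"),
    ("Spinach", "subClassOf", "Vegetable"),
    ("Spinach", "hasNutrient", "Iron"),
]

# Assumed mapping from predicate names to short English phrases.
PREDICATE_TEXT = {
    "subClassOf": "is a kind of",
    "hasNutrient": "contains the nutrient",
}


def select_subgraph(triples, query):
    """Keep triples whose subject or object is mentioned in the query."""
    terms = set(re.findall(r"[a-z]+", query.lower()))
    return [t for t in triples if t[0].lower() in terms or t[2].lower() in terms]


def verbalize(triples):
    """Formulate each subject-predicate-object tuple as a short sentence."""
    return [f"{s} {PREDICATE_TEXT.get(p, p)} {o}." for s, p, o in triples]


def build_prompt(query, triples):
    """Assemble the selected, verbalized triples and the query into one prompt."""
    context = "\n".join(verbalize(select_subgraph(triples, query)))
    return f"Context:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    print(build_prompt("Which nutrients are in cheddar?", TRIPLES))
```

Exact keyword matching is the simplest possible selection strategy and misses inflected forms and synonyms; it is used here only to keep the sketch self-contained.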

Email Sharing: We authorize the sharing of all author emails with Program Chairs.

Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.

Submission Number: 36
