
Arguably, Large Language Models (LLMs) have revolutionized natural language processing, demonstrating unprecedented capabilities in text generation, reasoning, and complex task execution.
However, these models bear a fundamental limitation: their knowledge is implicitly encoded within billions of parameters, which makes it opaque and static and leaves the models prone to hallucinations. When dealing with factual information, LLMs may generate plausible-sounding but incorrect responses, struggle with recent events not present in their training data, or fail to provide verifiable sources for their claims.

We posit Knowledge Graphs (KGs) as a valid solution to these challenges \cite{giunchiglia2021itelos,bocca2024building}.
Unlike the parametric knowledge stored in LLMs, KGs provide \textbf{structured}, \textbf{interpretable}, and \textbf{updatable} representations of factual information \cite{giunchiglia2021stratified}. 
KGs organize knowledge as networks of entities connected by typed relationships, enabling precise querying, easy verification, and efficient updates. 
The central challenge, however, is bridging the representational gap between a graph and the token sequences a language model consumes.
This integration often relies on \textbf{textual serialization} of KG information, converting structured triples into natural language descriptions that are then concatenated with the input prompt. 
This approach suffers from severe token inefficiency and high computational costs: a single logical fact may consume dozens of tokens when expressed in natural language.

Some studies propose small LoRA-like~\cite{hu2021loralowrankadaptationlarge} \textbf{adapters} to store new information in the model's parameters, while others rely on trained Knowledge Graph Embedding (KGE) models whose embeddings are injected into inner layers of the model;
however, these techniques require re-training for each new piece of information, which makes updates costly in dynamic environments.
An alternative solution relies on injecting knowledge as \textbf{continuous embeddings} rather than text.
Prior work has explored this research direction for standard graph-related tasks such as node classification and link prediction, while little attention has been given to using this approach to inject \textbf{factual} information into LLMs for natural language tasks such as question answering.
The few attempts have so far been validated only on small-scale language models~\cite{barmettler2025conceptformerefficientuseknowledgegraph}, leaving open the question of whether embedding-based discrete injection can yield improvements on modern, larger LLM backbones.

In this work, we propose the KoRe architecture, which advances the state of the art by addressing three critical research gaps:
\begin{itemize}
    \item \textbf{Token-Efficient Knowledge Representation:} To overcome the context pressure of text-based injection, KoRe utilizes a Directional Residual Vector Quantization (RVQ) scheme. This compresses graph structures into a minimal sequence of discrete ``knowledge tokens,'' achieving a dramatic reduction in token consumption while preserving the essential factual information required for accurate grounding.
    \item \textbf{Factual Conveyance via GNNs:} Unlike prior GNN-based approaches that focus on standard graph tasks, KoRe investigates whether GNN encodings can effectively transfer the \textit{factual content} of KGs to improve LLM accuracy in question answering.
    \item \textbf{Generalization to Unseen Entities:} Unlike previous knowledge integration approaches, our proposed pipeline can encode arbitrary knowledge graphs without requiring prior training of KGE models. We use a pretrained sentence encoder to initialize node and edge embeddings with the intended semantics and let the GNN aggregate this information into a single representation.
\end{itemize}
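
As a point of reference for the first contribution, generic residual vector quantization can be sketched as follows. This is the textbook formulation and not necessarily the directional variant used in KoRe; the symbols $\mathbf{z}$, $\mathcal{C}_k$, $K$, and $M$ are illustrative assumptions introduced only for this sketch. Given a $d$-dimensional embedding $\mathbf{z}$ and $K$ codebooks $\mathcal{C}_k = \{\mathbf{c}_{k,1}, \dots, \mathbf{c}_{k,M}\}$, set $\mathbf{r}_0 = \mathbf{z}$ and, for $k = 1, \dots, K$, compute
\[
    i_k = \arg\min_{j \in \{1, \dots, M\}} \| \mathbf{r}_{k-1} - \mathbf{c}_{k,j} \|_2,
    \qquad
    \mathbf{r}_k = \mathbf{r}_{k-1} - \mathbf{c}_{k,i_k}.
\]
The embedding is then represented by the $K$ discrete indices $(i_1, \dots, i_K)$ and approximated by $\hat{\mathbf{z}} = \sum_{k=1}^{K} \mathbf{c}_{k,i_k}$. Under these assumptions, an encoded graph representation would occupy $K$ positions in the model input, independently of the length of its textual serialization.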
