Contextual Tokenization for Graph Inverted Indices

Published: 23 Oct 2025, Last Modified: 28 Oct 2025, LOG 2025 Poster, CC BY 4.0
Keywords: graph indexing, graph retrieval, graph representation learning
TL;DR: We design graph inverted indices for graph retrieval.
Abstract: Retrieving graphs that contain a query subgraph is a core operation in molecular search, program analysis, and scene graph retrieval. Existing methods rely either on single-vector dense embeddings, which are efficient but coarse, or on multi-vector neural alignments, which are accurate but require exhaustive corpus scoring. We propose CoRGII (COntextual Representation of Graphs for Inverted Indexing), a framework that bridges these extremes by learning discrete, contextual graph tokens that can be indexed with classical inverted indices. Our contributions include (i) a differentiable graph tokenizer that discretizes node embeddings, (ii) a query-aware, trainable impact weighting mechanism, and (iii) co-occurrence-based multi-probing for balancing recall and efficiency. Extensive experiments show that CoRGII offers better accuracy-efficiency trade-offs than several baselines.
Submission Type: Extended abstract (max 4 main pages).
Submission Number: 129
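
To make the indexing idea in the abstract concrete, the following is a minimal sketch of a classical inverted index over discrete tokens, not the CoRGII implementation. It assumes each graph has already been reduced to a list of integer token IDs (the abstract's learned tokenizer is not modeled), and that per-token impact weights are given as a fixed dictionary rather than produced by a trained, query-aware model. The names `build_index`, `retrieve`, `corpus_tokens`, and `impact` are hypothetical, and multi-probing is omitted.

```python
from collections import defaultdict


def build_index(corpus_tokens):
    """Map each token ID to the set of graph IDs whose tokens include it."""
    index = defaultdict(set)
    for graph_id, tokens in corpus_tokens.items():
        for tok in set(tokens):  # duplicates within a graph are ignored
            index[tok].add(graph_id)
    return index


def retrieve(index, query_tokens, impact, top_k=2):
    """Rank graphs by the summed impact weight of the query tokens they match."""
    scores = defaultdict(float)
    for tok in set(query_tokens):
        # Only graphs in this token's posting list are touched,
        # so the corpus is never scored exhaustively.
        for graph_id in index.get(tok, ()):
            scores[graph_id] += impact.get(tok, 1.0)  # assumed default weight
    # Sort by descending score, breaking ties by graph ID for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]


# Toy corpus: graph ID -> token IDs (hypothetical values).
corpus_tokens = {"g1": [3, 7, 7, 12], "g2": [3, 5], "g3": [7, 12, 20]}
impact = {3: 0.5, 7: 2.0, 12: 1.0}

index = build_index(corpus_tokens)
result = retrieve(index, [7, 12], impact)
```

In this toy setting a query with tokens 7 and 12 only visits the posting lists for those two tokens, which is the efficiency argument for inverted indexing; how tokens and weights are learned is the subject of the paper and is not represented here.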