Math Formula Graph Retrieval Using Contrastive Learning Over Visual and Semantic Embeddings

Bryan Amador, Richard Zanibbi

Published: 11 Jul 2025, Last Modified: 29 Dec 2025 | ICTIR 2025 | Everyone | CC BY 4.0

Abstract: A key opportunity for making mathematical information more accessible is to build search engines that leverage relationships between visual and semantic formula representations. In this paper we introduce a formula retrieval model in which visual and semantic embeddings are queried separately but trained jointly. Embeddings are produced by two Relational Graph Convolutional Neural Networks (R-GCNs), which are jointly optimized using a self-supervised training task with a contrastive loss, in which pairs of visual and semantic formula nodes are classified as coming from the same formula or from different formulas. To avoid information loss, we use node embeddings to retrieve visual and semantic formula graphs, scoring each separately with a greedy alignment between query and candidate nodes in the manner of ColBERT. We explore different ways of combining and selecting visual and semantic relevance scores. We present results for two math formula retrieval benchmarks, ARQMath and NTCIR-12. The model performs comparably to the state of the art, and combining visual and semantic formula information yields a significant improvement.
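
The greedy node alignment described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes ColBERT-style late interaction, where each query node is matched to its most similar candidate node by cosine similarity and the maxima are summed. The function names (`cosine`, `greedy_alignment_score`, `combine_scores`), the toy 2-dimensional embeddings, and the weighted-sum combination are hypothetical; the abstract does not specify how scores are normalized or combined.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def greedy_alignment_score(query_nodes, candidate_nodes):
    """Sum, over query nodes, of the best cosine match among candidate nodes.

    This mirrors ColBERT's MaxSim operator; whether the paper normalizes
    the sum (e.g. by the number of query nodes) is not stated in the abstract.
    """
    if not query_nodes or not candidate_nodes:
        return 0.0
    return sum(max(cosine(q, c) for c in candidate_nodes) for q in query_nodes)


def combine_scores(visual_score, semantic_score, weight=0.5):
    """Hypothetical combination: a weighted sum of the two relevance scores."""
    return weight * visual_score + (1.0 - weight) * semantic_score


# Toy node embeddings (assumed values, for illustration only).
query_visual = [[1.0, 0.0], [0.0, 1.0]]
candidate_a_visual = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
candidate_b_visual = [[1.0, 1.0]]

score_a = greedy_alignment_score(query_visual, candidate_a_visual)
score_b = greedy_alignment_score(query_visual, candidate_b_visual)
print(score_a > score_b)  # candidate A contains an exact match for every query node
```

In this sketch the visual and semantic graphs of a formula would each be scored with `greedy_alignment_score` against the corresponding query graph, and the two scores would then be passed to `combine_scores` or compared to select one of them.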