Abstract: A key opportunity to make mathematical information more easily
accessible is creating search engines that leverage relationships
between visual and semantic formula representations. In this paper,
we introduce a formula retrieval model in which visual and semantic
embeddings are queried separately but trained jointly.
Embeddings are produced using two Relational Graph Convolutional
Neural Networks (R-GCNs), which are jointly optimized using a
self-supervised training task with a contrastive loss, where pairs
of visual and semantic formula nodes are classified as being from
the same formula or different formulas. To avoid information loss,
we use node embeddings to retrieve visual and semantic formula
graphs, with each scored separately using a greedy alignment
between query and candidate nodes in the manner of ColBERT.
We explore combining and selecting visual and semantic relevance scores
in different ways. We present results for two math formula retrieval
benchmarks, ARQMath and NTCIR-12. Our model performs comparably
to the state of the art and shows significant improvement
when visual and semantic information from formulas is combined.
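
As an illustration of the scoring step described above, the following is a minimal sketch of a ColBERT-style greedy alignment between query and candidate node embeddings. It is not the implementation used in the paper. The function names (`cosine`, `greedy_alignment_score`, `combined_score`), the use of cosine similarity, the unnormalized sum over query nodes, and the linear interpolation with a `weight` parameter are all assumptions made for this example; the abstract states only that visual and semantic graphs are scored separately and that the scores are combined or selected in different ways.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def greedy_alignment_score(query_nodes, candidate_nodes):
    """ColBERT-style late interaction: each query node is greedily matched to
    its most similar candidate node, and the best similarities are summed.
    Assumption: cosine similarity and an unnormalized sum."""
    if not query_nodes or not candidate_nodes:
        return 0.0
    return sum(
        max(cosine(q, c) for c in candidate_nodes)
        for q in query_nodes
    )


def combined_score(visual, semantic, weight=0.5):
    """Hypothetical combination: linear interpolation of the two relevance scores."""
    return weight * visual + (1.0 - weight) * semantic


# Toy 2-dimensional node embeddings for one graph representation.
query = [[1.0, 0.0], [0.0, 1.0]]
candidate = [[1.0, 0.0], [1.0, 1.0]]
score = greedy_alignment_score(query, candidate)
```

In this sketch the same function would be applied once to the visual node embeddings and once to the semantic node embeddings of a query and candidate formula, and the two resulting scores passed to `combined_score`. Because every node embedding takes part in the match, no pooling into a single formula vector is needed.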