TL;DR: Oracle-guided enhancement of memory representations improves task performance
Abstract: Retrieval-augmented classification and generation models benefit from *early-stage fusion* of high-quality text-based metadata, often called memory, but face high latency and noise sensitivity. In extreme classification (XC), where low latency is crucial, existing methods use *late-stage fusion* for efficiency and robustness. To enhance accuracy while maintaining low latency, we propose MOGIC, a novel approach to metadata-infused oracle guidance for XC. We train an early-fusion oracle classifier with access to both query-side and label-side ground-truth metadata in textual form and subsequently use it to guide existing memory-based XC disciple models via regularization. The MOGIC algorithm improves precision@1 and propensity-scored precision@1 of XC disciple models by 1-2% on six standard datasets, at no additional inference-time cost. We show that MOGIC can be used in a plug-and-play manner to enhance memory-free XC models such as NGAME or DEXA. Lastly, we demonstrate the robustness of the MOGIC algorithm to missing and noisy metadata. The code is publicly available at [https://github.com/suchith720/mogic](https://github.com/suchith720/mogic).
Lay Summary: In a text classification task, given a query, we aim to predict the labels relevant to that query. In many cases, additional information about the query or the label (called metadata or memory) is available and can be leveraged to make better predictions. Currently popular methods such as RAG, which retrieve this metadata and augment the model input with it for classification or generation, tend to have high latency and are sensitive to noise.
We propose a two-stage technique that utilizes this extra information while also meeting latency constraints. First, we train a powerful oracle model that takes advantage of the metadata, assuming a best-case scenario in which this information is also available during inference. Then, we train an off-the-shelf classifier model as a disciple that learns to mimic the behaviour of the oracle under the more challenging scenario in which the metadata is not known a priori. In this way, we get the best of both worlds: a model that is efficient and fast at inference, while benefiting from the improved accuracy gained by learning from the powerful oracle.
We observe that this training technique (which we call Metadata-infused Oracle Guidance for Improved Extreme Classification, or MOGIC) improves the overall accuracy of existing classifiers, while also offering a novel approach to incorporating extra information into the classification setting.
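The two-stage recipe above can be sketched in a few lines. The snippet below is a minimal toy illustration, not the paper's implementation: it simulates an oracle that sees metadata-enriched representations at training time, then trains a linear disciple on a standard classification loss plus a regularizer that pulls its scores toward the oracle's. All names, dimensions, and the specific L2 alignment term are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (all names and dimensions are illustrative):
# queries X, relevant label ids y.
n, d, k = 64, 16, 8
X = rng.normal(size=(n, d))             # query features
y = rng.integers(0, k, size=n)          # relevant label ids
W_true = rng.normal(size=(d, k))

# Stage 1 (assumed already trained): oracle scores Z_oracle, produced
# with access to ground-truth metadata -- here simply simulated as
# cleaner, richer scores than the disciple could compute on its own.
Z_oracle = X @ W_true + 0.1 * rng.normal(size=(n, k))

def softmax(s):
    e = np.exp(s - s.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def guided_loss(W, lam=0.5):
    """Classification loss + oracle-alignment regularizer (sketch)."""
    Z = X @ W                            # disciple scores (no metadata)
    p = softmax(Z)
    ce = -np.log(p[np.arange(n), y] + 1e-12).mean()
    align = ((Z - Z_oracle) ** 2).mean() # pull disciple toward oracle
    return ce + lam * align

# Stage 2: train the disciple by gradient descent on the combined
# objective; at inference only X @ W is needed, so there is no
# metadata-retrieval cost at test time.
W = np.zeros((d, k))
lr, lam = 0.05, 0.5
loss0 = guided_loss(W, lam)
onehot = np.eye(k)[y]
for _ in range(200):
    Z = X @ W
    p = softmax(Z)
    g_ce = X.T @ (p - onehot) / n                # grad of cross-entropy
    g_align = 2 * X.T @ (Z - Z_oracle) / (n * k) # grad of alignment term
    W -= lr * (g_ce + lam * g_align)
loss1 = guided_loss(W, lam)
print(f"combined loss: {loss0:.3f} -> {loss1:.3f}")
```

The key design point this sketch mirrors is that the metadata-dependent oracle appears only in the training objective; the deployed disciple is unchanged architecturally, which is why the guidance adds no inference-time cost.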
Application-Driven Machine Learning: This submission is on Application-Driven Machine Learning.
Link To Code: https://github.com/suchith720/mogic
Primary Area: General Machine Learning->Representation Learning
Keywords: Extreme Classification, XC, XML, XMC, Transformers, Metadata, Auxiliary information, Information Retrieval, Regularization, Supervised Learning, Memory-based Models, Neural Networks
Submission Number: 10762