Keywords: chemical foundation models, protein foundation models, low rank adaptation, olfaction, multi-modal model, computational neuroscience
TL;DR: We benchmark chemical foundation models for odorant-receptor binding prediction and introduce LORAX, a LoRA-based cross-attention model that outperforms existing approaches and yields more informative odorant representations.
Abstract: Featurizing odorants to enable robust prediction of their properties is difficult due to the complex activation patterns that odorants evoke in the olfactory system. Structurally similar odorants can elicit distinct activation patterns in both the sensory periphery (i.e., at the receptor level) and downstream brain circuits (i.e., at a perceptual level). Despite efforts to design features for odorants to better predict how they interact with the olfactory system, there is still no universally accepted way to featurize odorants. In this work, we demonstrate that feature-based approaches that rely on pre-trained foundation models to generate odorant representations $\textit{do not}$ significantly outperform classical hand-designed features on odorant-receptor binding tasks. Instead, we show that it is necessary to fine-tune these features to increase predictive performance. To show this, we introduce a new model that creates olfaction-specific representations: $\textbf{L}$oRA-based $\textbf{O}$dorant-$\textbf{R}$eceptor $\textbf{A}$ffinity prediction with $\textbf{CROSS}$-attention ($\textbf{LORAX}$). We compare existing chemical foundation model representations to hand-designed physicochemical descriptors using feature-based methods and identify large information overlap between these representations, highlighting the necessity of fine-tuning to generate novel and superior odorant representations. We show that LORAX produces a feature space more closely aligned with olfactory neural representation, enabling it to outperform existing models on predictive tasks.
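The core mechanism named in the title, low-rank adaptation (LoRA) of a frozen foundation-model layer combined with cross-attention between odorant and receptor token embeddings, can be sketched as follows. This is a minimal illustrative sketch, not the authors' LORAX implementation: the dimensions, the single-head attention, and all function names are assumptions for exposition.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 16, 2  # hidden size and LoRA rank (illustrative values, not from the paper)

# Frozen pre-trained projection weight (stand-in for a chemical foundation-model layer)
W = rng.standard_normal((d, d))

# LoRA: effective weight is W + (alpha / r) * B @ A; only A and B would be trained
A = rng.standard_normal((r, d)) * 0.01
B = np.zeros((d, r))  # B initialized to zero, so the adapter starts as a no-op
alpha = 4.0

def lora_project(x):
    """Apply the frozen weight plus the low-rank LoRA update."""
    return x @ (W + (alpha / r) * (B @ A)).T

def cross_attention(odorant_tokens, receptor_tokens):
    """Single-head cross-attention: odorant queries attend over receptor keys/values."""
    q = lora_project(odorant_tokens)          # queries from LoRA-adapted odorant embeddings
    scores = q @ receptor_tokens.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ receptor_tokens          # odorant tokens fused with receptor context

odorant = rng.standard_normal((5, d))   # 5 odorant tokens
receptor = rng.standard_normal((8, d))  # 8 receptor tokens
fused = cross_attention(odorant, receptor)
print(fused.shape)  # (5, 16): one receptor-contextualized vector per odorant token
```

Because `B` starts at zero, the adapted model initially reproduces the frozen foundation-model features exactly; training only the low-rank factors `A` and `B` then specializes those features to the binding task, which is the sense in which fine-tuning can yield representations that hand-designed descriptors and frozen embeddings do not.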
Supplementary Material: pdf
Primary Area: foundation or frontier models, including LLMs
Submission Number: 20909