Low-rank adaptation of chemical foundation models generates effective odorant representations

ICLR 2026 Conference Submission 20909 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: chemical foundation models, protein foundation models, low rank adaptation, olfaction, multi-modal model, computational neuroscience
TL;DR: We benchmark chemical foundation models for odorant-receptor binding prediction and introduce LORAX, a LoRA-based cross-attention model that outperforms existing approaches and yields more informative odorant representations.
Abstract: Featurizing odorants to enable robust prediction of their properties is difficult due to the complex activation patterns that odorants evoke in the olfactory system. Structurally similar odorants can elicit distinct activation patterns in both the sensory periphery (i.e., at the receptor level) and downstream brain circuits (i.e., at a perceptual level). Despite efforts to design or discover features for odorants that better predict how they activate the olfactory system, we lack a universally accepted way to featurize odorants. In this work, we demonstrate that feature-based approaches relying on pre-trained foundation models $\textit{do not}$ significantly outperform classical hand-designed features, but that targeted foundation model fine-tuning can increase model performance beyond these limits. To show this, we introduce a new model that creates olfaction-specific representations: $\textbf{L}$oRA-based $\textbf{O}$dorant-$\textbf{R}$eceptor $\textbf{A}$ffinity prediction with $\textbf{CROSS}$-attention ($\textbf{LORAX}$). We compare existing chemical foundation model representations to hand-designed physicochemical descriptors using feature-based methods and identify large information overlap between these representations, highlighting the necessity of fine-tuning to generate novel and superior odorant representations. We show that LORAX produces a feature space more closely aligned with olfactory neural representations, enabling it to outperform existing models on predictive tasks.
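To make the abstract's architectural idea concrete, below is a minimal PyTorch sketch of the two ingredients LORAX combines: LoRA adapters on a frozen foundation model and a cross-attention readout between odorant and receptor token embeddings. All module names, dimensions, pooling, and readout choices are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the LORAX idea: LoRA adapters over frozen pre-trained
# weights, plus cross-attention between odorant and receptor embeddings to
# produce a binding score. Sizes and pooling are assumptions for illustration.
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Frozen linear layer with a trainable low-rank update: W x + (alpha/r) B A x."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # keep pre-trained weights fixed
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init: no-op at start
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)


class CrossAttentionHead(nn.Module):
    """Odorant tokens attend over receptor tokens; pooled output scores binding."""

    def __init__(self, d_model: int = 256, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.readout = nn.Sequential(nn.LayerNorm(d_model), nn.Linear(d_model, 1))

    def forward(self, odorant_tokens, receptor_tokens):
        # Queries come from the odorant; keys/values from the receptor.
        mixed, _ = self.attn(odorant_tokens, receptor_tokens, receptor_tokens)
        return self.readout(mixed.mean(dim=1)).squeeze(-1)  # one binding logit per pair


# Usage with stand-in tensors for frozen foundation-model token embeddings:
odorant = torch.randn(2, 30, 256)    # e.g., chemical-FM embeddings (batch, tokens, dim)
receptor = torch.randn(2, 100, 256)  # e.g., protein-FM embeddings
logits = CrossAttentionHead()(odorant, receptor)  # shape (2,)
```

In such a setup, only the LoRA factors and the cross-attention head would be trained on odorant-receptor binding data, while the foundation-model backbones stay frozen.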
Primary Area: foundation or frontier models, including LLMs
Submission Number: 20909