A conformational benchmark for optical property prediction with solvent-aware graph neural networks

Denis Potapov, Sergei Rogovoi, Kuzma Khrabrov, Konstantin Ushenin, Alexey Korovin, Anton Ber, Artur Kadurin, Artem Tsypin

Published: 18 Feb 2026, Last Modified: 18 Mar 2026, Venue: Communications Chemistry, License: CC BY-SA 4.0
Abstract: Accurately predicting the optical spectra of molecules is essential for creating better OLED emitters, solar-cell dyes, and fluorescent probes. Traditional methods, such as time-dependent density-functional theory, are computationally expensive and often inaccurate. Current graph neural network (GNN) approaches to optical property prediction are faster and offer better performance. Still, they operate on 2D graphs and ignore the 3D geometrical features that control excited-state behavior. We present nablaColors-3D, a rigorously curated dataset for the prediction of optical properties consisting of 26,369 chromophore-solvent pairs with three conformations optimized at different levels of quantum theory. Based on this dataset, we establish a scaffold-split benchmark for 3D GNNs and systematically quantify how the fidelity of geometry optimization affects accuracy. Furthermore, we propose a solvent-aware modification for pretrained SE(3)-invariant architectures. Our best model, built on UniMol+, achieves an MAE of 15.97 nm on a held-out test set, improving on the previous state of the art by more than 30%.

Current graph neural networks (GNNs) for the prediction of optical properties in molecules operate on 2D graphs, potentially overlooking 3D geometrical features underlying excited-state behaviour. Here, the authors present nablaColors, a curated dataset for the prediction of optical properties consisting of 26,369 chromophore-solvent pairs with three conformations optimized at different levels of theory, establish a scaffold-split benchmark for 3D GNNs, and propose a solvent-aware modification for pretrained SE(3)-invariant GNN architectures.