From Molecules to Perception: A Benchmark Dataset for AI in Sensory Science

Published: 24 Sept 2025, Last Modified: 26 Dec 2025 | NeurIPS2025-AI4Science Poster | CC BY 4.0
Track: Track 2: Dataset Proposal Competition
Keywords: Sensory science, molecular sensory, food formulation, drug discovery
TL;DR: A Benchmark Dataset for AI in Sensory Science
Abstract: Sensory perception—how molecules taste, smell, and ultimately feel pleasant or unpleasant—plays a critical role in food formulation, cosmetics, and pharmaceuticals. Yet, while vast datasets exist for molecular bioactivity, large-scale, standardized datasets linking chemical structure to sensory attributes remain scarce. This absence is a major bottleneck: promising molecules are often discarded due to undesirable taste or odor, and current AI efforts rely on fragmented, small-scale data with limited predictive power. We propose the Molecular Sensory Dataset (MSD), an open resource designed to capture molecular taste, odor, intensity, and pleasantness at scale. MSD will integrate high-throughput instrumentation—including electronic nose/tongue arrays, n-noise spectrometry, and n-tonne olfactometry—with standardized sensory descriptors and calibrated human panel ratings of hedonic value. Importantly, the dataset will cover both single molecules and mixtures, reflecting real-world applications where synergistic and masking effects shape perception. By establishing a benchmark for AI-driven prediction, generation, and optimization of sensory properties, MSD promises to accelerate discovery across food, fragrance, and drug development, while advancing fundamental understanding of the chemistry of perception.
Submission Number: 128