Scaling High-Throughput Experimentation Unlocks Robust Reaction-Outcome Prediction

Published: 24 Sept 2025, Last Modified: 26 Dec 2025 | NeurIPS2025-AI4Science Poster | CC BY 4.0
Track: Track 1: Original Research/Position/Education/Attention Track
Keywords: Deep learning, organic chemistry, high throughput experimentation
TL;DR: Scaling high-throughput experimentation unlocks robust reaction-outcome prediction
Abstract: Organic chemistry underpins small-molecule drug discovery, yet, unlike structural biology, it lacks large, unbiased datasets for training broadly generalizable models. We report the largest microliter-scale high-throughput experimentation (HTE) campaign to date: $200{,}000$ reactions spanning three workhorse classes (Amide Coupling, Suzuki Coupling, Buchwald–Hartwig Coupling) and involving $30{,}000$ unique products, over $4\times$ larger than the largest publicly disclosed HTE dataset. This scale and diversity enable reaction-outcome predictors that generalize to unseen substrates. We introduce UniReact, a molecule-attention Transformer built on pretrained molecular encoders. Across the three reaction classes, our models achieve PR-AUC $2$–$3\times$ higher than a random baseline and ROC-AUC in the $70$–$86\%$ range. We further establish scaling laws for reaction-outcome prediction on HTE data. In a human study on Suzuki Coupling prioritization, our models outperform PhD-level chemists (precision $87.1\%$ at $50\%$ recall vs. $60.8\%$). Finally, we present what is, to the best of our knowledge, the first demonstration of zero-shot transfer to an external HTE dataset. Taken together, these results support scaled HTE as a viable path to broadly applicable predictors of chemical reactivity that surpass human intuition and ultimately help discover novel chemistry.
Submission Number: 253
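The abstract reports two ranking metrics: PR-AUC relative to a random baseline, and precision at a fixed recall of $50\%$. The sketch below shows one common way such numbers can be computed. It is not the authors' evaluation code: the function names and the toy labels and scores are hypothetical, and it assumes that the random baseline for PR-AUC is the positive-class prevalence (the expected precision of a random ranking), which the abstract does not state.

```python
from typing import List, Tuple


def pr_curve(labels: List[int], scores: List[float]) -> List[Tuple[float, float]]:
    """Return (recall, precision) points, one per distinct score threshold.

    Assumes binary labels (1 = reaction succeeded) with at least one positive.
    """
    order = sorted(range(len(labels)), key=lambda i: -scores[i])
    total_pos = sum(labels)
    points = []
    tp = 0
    for rank, i in enumerate(order, start=1):
        tp += labels[i]
        # Emit a point only after all items tied at this score are included.
        if rank == len(order) or scores[order[rank]] != scores[i]:
            points.append((tp / total_pos, tp / rank))
    return points


def precision_at_recall(labels: List[int], scores: List[float], target: float) -> float:
    """Best precision among thresholds whose recall is at least `target`."""
    return max(p for r, p in pr_curve(labels, scores) if r >= target)


def average_precision(labels: List[int], scores: List[float]) -> float:
    """Step-wise area under the precision-recall curve."""
    ap, prev_recall = 0.0, 0.0
    for r, p in pr_curve(labels, scores):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def lift_over_random(labels: List[int], scores: List[float]) -> float:
    """PR-AUC divided by prevalence (assumed random-baseline PR-AUC)."""
    prevalence = sum(labels) / len(labels)
    return average_precision(labels, scores) / prevalence


# Hypothetical toy data: 8 candidate reactions, 3 of which succeed.
labels = [1, 0, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

print(precision_at_recall(labels, scores, 0.5))  # 0.75 on this toy data
print(round(lift_over_random(labels, scores), 2))
```

Under this convention, a lift of $2$–$3\times$ means the model's PR-AUC is two to three times the fraction of successful reactions in the evaluation set, so the same lift corresponds to different absolute PR-AUC values for reaction classes with different success rates.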