Abstract: Insect-borne diseases kill >0.5 million people annually. Currently available repellents for personal or
household protection are limited in their efficacy, applicability, and safety profile. Here, we describe a
machine-learning-driven high-throughput method for the discovery of novel repellent molecules. To
achieve this, we digitized a large, historic dataset containing ~19,000 mosquito repellency
measurements. We then trained a graph neural network (GNN) to map molecular structure to
repellency. We applied this model to select 317 candidate molecules to test in parallelizable behavioral
assays, quantifying repellency in multiple pest species and in follow-up trials with human volunteers.
The GNN approach outperformed a chemoinformatic model and produced a hit rate that increased
with training data size, suggesting that both model innovation and novel data collection were integral
to predictive accuracy. We identified >10 molecules with repellency similar to or greater than the most
widely used repellents. This approach enables computational screening of billions of possible
molecules to identify empirically tractable numbers of candidate repellents, leading to accelerated
progress towards solving a global health challenge.
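
The hit rate referred to above can be read as the fraction of tested candidates whose measured repellency meets some cutoff. The following is a minimal sketch under that assumption; the function name, the threshold value, and the example scores are hypothetical and are not taken from the study.

```python
def hit_rate(scores, threshold):
    """Return the fraction of tested candidates whose score meets the threshold.

    `scores` holds one measured repellency value per tested molecule.
    The scale and the threshold are assumptions made for illustration.
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    hits = sum(1 for s in scores if s >= threshold)
    return hits / len(scores)


# Hypothetical measured repellency values for four tested candidates.
measured = [0.9, 0.2, 0.75, 0.4]
rate = hit_rate(measured, threshold=0.7)
```

Comparing this quantity across models trained on increasing amounts of data is one way to check whether hit rate grows with training set size.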