Out-of-Distribution Validation for Bioactivity Prediction in Drug Discovery: Lessons from Materials Science

Published: 17 Jun 2024, Last Modified: 16 Jul 2024
Venue: ML4LMS Poster
Readers: Everyone
License: CC BY-SA 4.0
Keywords: Drug Discovery, Validation, Bioactivity, Out-of-Distribution, Protein Target
TL;DR: We adapt ML methods popular in materials science to drug discovery, focusing on assessing performance on out-of-distribution data.
Abstract: Recent advances in machine learning for materials science have significantly improved the prediction of novel materials. We adapt these methods to drug discovery, focusing specifically on assessing performance on out-of-distribution data. Using k-fold n-step forward cross-validation (SFCV) to evaluate bioactivity prediction for small molecules, we found this approach more effective than conventional cross-validation methods. Additionally, we introduced two new metrics: discovery yield and novelty error. These metrics provide deeper insights into model applicability and prediction accuracy for drug-like molecules. Based on our findings, we recommend incorporating these metrics into state-of-the-art bioactivity prediction models for drug discovery.
Poster: pdf
Submission Number: 57
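
The abstract names k-fold n-step forward cross-validation but does not spell out the splitting procedure. The following is a minimal sketch of one common reading, not the authors' implementation: samples are sorted by some scalar property, cut into k contiguous bins, and each fold trains on the first bins and tests on the bin n steps ahead. The function name `step_forward_splits`, the choice of sorting property, and the equal-size binning are all assumptions made for illustration.

```python
import numpy as np


def step_forward_splits(sort_values, k=5, n=1):
    """Yield (train_idx, test_idx) pairs for a k-fold n-step forward split.

    Hypothetical helper: samples are ordered by `sort_values` and cut into
    k contiguous bins. Fold i trains on bins 0..i and tests on bin i + n,
    so every test set lies beyond the training range of the sorting property.
    """
    order = np.argsort(sort_values, kind="stable")
    bins = np.array_split(order, k)
    for i in range(k - n):
        train_idx = np.concatenate(bins[: i + 1])
        test_idx = bins[i + n]
        yield train_idx, test_idx


# Toy data: one scalar property per molecule (values are illustrative only).
values = np.array([0.3, 2.1, 1.2, 4.8, 3.3, 0.9, 2.7, 4.1, 1.8, 3.9])
splits = list(step_forward_splits(values, k=5, n=1))
for train_idx, test_idx in splits:
    print(len(train_idx), "training samples ->", len(test_idx), "test samples")
```

Unlike a random k-fold split, the training set here grows fold by fold while the test set always sits outside the training range, which is what makes the evaluation out-of-distribution with respect to the chosen property.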