Development of Interpretable Machine Learning Models for COVID-19 Drug Target Docking Scores Prediction

Published: 01 Jan 2023, Last Modified: 20 Jul 2024. Venue: BIBM 2023. License: CC BY-SA 4.0
Abstract: Drug discovery demands extensive time and funding, so computational approaches such as protein-ligand docking prediction are increasingly crucial for accelerating drug repurposing. However, the growing number of identified protein targets has exposed a critical gap: robust docking score prediction models that are both generalizable and interpretable are still lacking. To address this, our study presents a machine learning-based surrogate model that employs interpretable artificial intelligence techniques for accurate docking score prediction on SARS-CoV-2 protein targets. We demonstrate that the model generalizes when it is extended to unseen protein targets by integrating protein target information through feature concatenation. Moreover, we use the SHapley Additive exPlanations (SHAP) method to identify the data-driven feature importance of molecular substructures, which is then checked through knowledge-based validation. Our experiments show that combining data-driven prediction with knowledge-driven validation could provide biomedical insights into the interactions between drugs and SARS-CoV-2 proteins and clarify their effects on docking scores.
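The two mechanisms named in the abstract, feature concatenation and Shapley-value attribution, can be illustrated with a minimal sketch. This is not the authors' model or pipeline: the feature vectors, the linear `toy_surrogate`, and its weights are hypothetical stand-ins, and the attribution is computed by exact subset enumeration rather than with the SHAP library, which is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial


def concat_features(ligand_bits, protein_desc):
    """Join ligand substructure flags and protein descriptors into one vector."""
    return list(ligand_bits) + list(protein_desc)


def toy_surrogate(x):
    """Hypothetical linear stand-in for a trained docking-score regressor."""
    weights = [-1.5, 0.5, -2.0, 0.25]  # illustrative values, not learned
    return sum(w * v for w, v in zip(weights, x))


def coalition_value(model, x, background, subset):
    """Mean prediction when only features in `subset` are taken from x;
    all other features are filled in from each background row."""
    total = 0.0
    for row in background:
        mixed = [x[i] if i in subset else row[i] for i in range(len(x))]
        total += model(mixed)
    return total / len(background)


def shapley_values(model, x, background):
    """Exact Shapley values by enumerating every coalition (O(2^n) cost)."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for combo in combinations(others, size):
                s = set(combo)
                gain = (coalition_value(model, x, background, s | {i})
                        - coalition_value(model, x, background, s))
                phi[i] += weight * gain
    return phi


# Hypothetical inputs: two substructure flags and two protein descriptors.
x = concat_features([1, 0], [1.0, 2.0])
background = [
    concat_features([0, 0], [0.0, 0.0]),
    concat_features([0, 1], [1.0, 4.0]),
]

phi = shapley_values(toy_surrogate, x, background)
baseline = coalition_value(toy_surrogate, x, background, set())
print("attributions:", phi)
print("baseline + sum(phi):", baseline + sum(phi))
print("model(x):", toy_surrogate(x))
```

The attributions sum, together with the baseline prediction, to the model output for `x`, which is the additivity property that makes per-substructure importances comparable across inputs. Because protein descriptors sit in the same vector as ligand features, they receive attributions in the same way.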