Navigating Trustworthiness of Deep Learning in ∆∆G Prediction: Addressing Data Bias, Model Evaluation, and Interpretation

Published: 17 Jun 2024, Last Modified: 16 Jul 2024
Venue: ML4LMS Poster
License: CC BY 4.0
Keywords: Protein-protein interactions, PPI, Data Bias, model evaluation and interpretation
Abstract: Artificial intelligence has become a focus of global attention with the rapid proliferation of cutting-edge AI tools. One promising application is the use of deep learning to address complex biological problems. However, an essential question arises about the reliability and utility of deep learning models in the biosciences, where experimental data are often limited, especially compared with the vast data available in other domains. In this work, we focus on the task of predicting the change in binding affinity (∆∆G) induced by mutations in protein-protein interactions, and explore the impact of data bias and of the methods used for model evaluation and interpretation. Surprisingly, we find that deep learning models may learn only the unintentional bias in the dataset rather than intrinsic principles. Proper data analysis and model evaluation should therefore be applied, rather than focusing solely on improving evaluation metrics. Our work provides a guideline for navigating the trustworthiness challenges of deep learning in bioscience and offers suggestions for future improvements.
Poster: pdf
Submission Number: 51