Navigating Trustworthiness of Deep Learning in ∆∆G prediction: Addressing Data Bias, Model Evaluation, and Interpretation

Ruochi Zhang; Ningning Chen; Fengfeng Zhou; Xin Gao

Published: 17 Jun 2024, Last Modified: 16 Jul 2024. Venue: ML4LMS Poster. License: CC BY 4.0

Keywords: Protein-protein interactions, PPI, Data Bias, model evaluation and interpretation

Abstract: Artificial intelligence has drawn global attention with the rapid proliferation of cutting-edge AI tools. One promising application is the use of deep learning to tackle complex biological problems. However, an essential question arises about the reliability and utility of deep learning models in the biosciences, where experimental data are often limited, especially compared with the vast data available in other domains. In this work, we focus on predicting the change in binding affinity (∆∆G) induced by mutations in protein-protein interactions, and we explore the impact of data bias as well as the methods used for model evaluation and interpretation. Surprisingly, we find that deep learning models may learn only the unintentional bias in the dataset rather than the intrinsic principles. Proper data analysis and model evaluation should therefore be applied, rather than focusing solely on improving evaluation metrics. Our work provides a guideline for navigating the trustworthiness challenges of deep learning in bioscience and offers suggestions for future improvements.

Poster: pdf

Submission Number: 51
