Challenges in Leveraging Functional Information to Evaluate Predicted Protein-Ligand Interactions

Joseph G. Wakim; Jose Manuel Marti; Jonathan E Allen; Adam T. Zemla

Published: 06 Oct 2025, Last Modified: 06 Oct 2025

Venue: NeurIPS 2025 2nd Workshop FM4LS Poster

License: CC BY 4.0

Keywords: Foundation Models, Deep Learning, Embeddings, Benchmarking, Protein-Ligand Interactions, Drug Discovery, Drug Selectivity, Protein Function

TL;DR: We introduce a framework for scoring drug selectivity and evaluating predicted protein-ligand interactions using functional information, demonstrating inconsistencies in protein embeddings and highlighting the need for more reliable representations.

Abstract: Protein-ligand interactions (PLIs) are fundamental to the efficacy and toxicity of drugs, and predicting these interactions with computational models can accelerate drug development. Given an uncharacterized protein and its predicted structure, putative interactions with ligands can be identified based on structural alignment with known binding pockets. However, the accuracy of these predictions depends on the reliability of the protein model. Functional information offers an observable comparator for evaluating predicted PLIs. Yet, existing methods for embedding protein function cluster proteins inconsistently; for the same protein pairs, their relative distances in a functional latent space can vary depending on the embedding method used. To assess challenges in scoring protein function similarity, we evaluate similarity scores using benchmarks that label protein pairs based on shared attributes. For example, we consider benchmarks that label proteins based on shared localization or disease associations, where positive examples share the attribute and negative examples do not. For each benchmark, we quantify how well popular protein representations differentiate between the positive and negative groups. We then demonstrate an innovative framework for leveraging functional similarity scores to characterize drug selectivity and evaluate predicted PLIs. We show that our function-based evaluations remain limited by uncertainty in similarity scores. Overall, we demonstrate the critical need for more reliable similarity-scoring metrics and present a framework for their use in evaluating predicted PLIs during computational drug development.
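The benchmark evaluation described in the abstract (scoring protein pairs, then asking how well the scores separate positive from negative pairs) can be illustrated with a minimal sketch. The abstract does not name a similarity measure or a separation statistic, so the choices here are assumptions: cosine similarity between embeddings, and the area under the ROC curve computed as the fraction of positive/negative score comparisons won by the positive pair. The function names, protein labels, and toy embeddings are hypothetical and are not taken from the paper.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors (assumed similarity score)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def score_pairs(embeddings, pairs):
    """Score each (protein, protein) pair by the similarity of their embeddings."""
    return [cosine_similarity(embeddings[p], embeddings[q]) for p, q in pairs]


def separation_auc(pos_scores, neg_scores):
    """Probability that a random positive pair outscores a random negative pair.

    Ties count as half a win, which makes this equal to the ROC AUC.
    """
    pos = np.asarray(pos_scores, dtype=float)[:, None]
    neg = np.asarray(neg_scores, dtype=float)[None, :]
    wins = (pos > neg).sum() + 0.5 * (pos == neg).sum()
    return float(wins / (pos.size * neg.size))


# Hypothetical toy embeddings: A/B share an attribute, C/D share another.
embeddings = {
    "A": [1.0, 0.0],
    "B": [0.9, 0.1],
    "C": [0.0, 1.0],
    "D": [0.1, 0.9],
}
positive_pairs = [("A", "B"), ("C", "D")]  # pairs that share the attribute
negative_pairs = [("A", "C"), ("B", "D")]  # pairs that do not

auc = separation_auc(
    score_pairs(embeddings, positive_pairs),
    score_pairs(embeddings, negative_pairs),
)
print(auc)
```

Under these assumptions, a value near 1 means the embedding separates positive from negative pairs well, and a value near 0.5 means it does no better than chance. Repeating the calculation with a different embedding method for the same pairs is one way to expose the inconsistency the abstract describes.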

Submission Number: 49
