Temporal Protein Evolution Prediction for Proactive Pathogen Surveillance

06 Sept 2025 (modified: 16 Oct 2025)
Submitted to NeurIPS 2025 2nd Workshop FM4LS
License: CC BY 4.0
Keywords: protein evolution, time to sequence mapping, protein language model, biosurveillance
TL;DR: PEVO introduces a neural framework that learns direct mappings from temporal embeddings to protein sequence embeddings for evolutionary prediction.
Abstract: Current pathogen biosurveillance systems reactively monitor evolving sequences from environmental samples and assess mutation risks to guide public health responses. To enable proactive surveillance, we developed PEVO, a deep learning framework that explores direct prediction of future protein evolution by learning mappings between temporal and sequence representations. Our approach combines TOTEM (Time-Ordered Evolutionary Modeling) embeddings, which capture quarterly temporal patterns, with ESM (Evolutionary Scale Modeling) protein sequence embeddings, and trains a neural network to map from TOTEM time space to ESM sequence space. We demonstrate our method on Ebola virus L protein sequences collected from 1976-2018 (n=2343) with quarterly temporal binning, generating predictions for 2019-2030 and validating on known sequences after 2019 (n=596). While our TOTEM-to-ESM mapping achieved an MSE of 0.002 (RMSE = 0.045), a phylodynamics baseline outperformed our approach with an RMSE of 0.007. Additionally, our model's prediction error (RMSE = 0.045) exceeds the natural variability observed in the training data (STD = 0.0292), indicating room for improvement in capturing evolutionary constraints. Despite these limitations, this work establishes a foundational framework for temporal-sequence learning that could complement traditional phylodynamic approaches. With further refinement, such neural approaches may offer computational advantages for automated biosurveillance systems, helping shift pathogen surveillance from reactive monitoring toward proactive preparedness for future pandemic threats.
Submission Number: 75
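
The pipeline outlined in the abstract (quarterly binning, a learned map from a time representation to a sequence-embedding space, and evaluation by MSE/RMSE) can be illustrated with a minimal sketch. This is not the PEVO implementation: the sinusoidal `time_features` function is a hypothetical stand-in for TOTEM embeddings, the targets are synthetic random vectors rather than ESM embeddings, and a linear least-squares fit replaces the neural network. All function names and parameters here are assumptions made for illustration only.

```python
import numpy as np

def quarter_index(year, month, start_year=1976):
    """Map a calendar date to a 0-based quarterly bin counted from start_year."""
    return (year - start_year) * 4 + (month - 1) // 3

def time_features(q, n_freqs=4):
    """Hypothetical stand-in for a temporal embedding (NOT TOTEM):
    sinusoidal features of the quarter index plus a bias column."""
    q = np.asarray(q, dtype=float)[:, None]       # shape (n, 1)
    freqs = 1.0 / (10.0 ** np.arange(n_freqs))    # shape (n_freqs,)
    angles = q * freqs                            # shape (n, n_freqs)
    bias = np.ones((q.shape[0], 1))
    return np.hstack([np.sin(angles), np.cos(angles), bias])

def fit_linear_map(X, Y):
    """Least-squares linear map from time features to embedding space
    (a simple substitute for the neural network in the abstract)."""
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W

def mse_rmse(pred, target):
    """Mean squared error and its square root over all embedding entries."""
    mse = float(np.mean((pred - target) ** 2))
    return mse, float(np.sqrt(mse))

# One bin per quarter from 1976 Q1 through 2018 Q4.
quarters = np.arange(quarter_index(1976, 1), quarter_index(2018, 12) + 1)
X = time_features(quarters)

# Synthetic placeholder targets; real use would supply per-quarter
# sequence embeddings here instead.
rng = np.random.default_rng(0)
Y = rng.normal(scale=0.03, size=(len(quarters), 8))

W = fit_linear_map(X, Y)
mse, rmse = mse_rmse(X @ W, Y)
print(f"bins={len(quarters)} mse={mse:.5f} rmse={rmse:.5f} target_std={Y.std():.5f}")
```

Because RMSE is the square root of MSE, the two figures reported in the abstract are mutually consistent: the square root of 0.002 is roughly 0.0447, which rounds to 0.045. Comparing RMSE against the standard deviation of the targets, as the abstract does, indicates whether a model's error is larger or smaller than the spread already present in the data.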