Unlearning Virus Knowledge Toward Safe and Responsible Mutation Effect Predictions

22 Sept 2024 (modified: 05 Feb 2025) · Submitted to ICLR 2025 · CC BY 4.0
Keywords: protein language model, mutation effect prediction, AI safety
TL;DR: We propose a knowledge unlearning method that guides a pre-trained protein language model to unlearn undesired virus knowledge.
Abstract: Pre-trained deep protein models have become essential tools in fields such as biomedical research, enzyme engineering, and therapeutics due to their ability to predict and optimize protein properties effectively. However, the diverse and broad training data used to enhance the generalizability of these models may also inadvertently introduce ethical risks and pose biosafety concerns, such as the enhancement of harmful viral properties like transmissibility or drug resistance. To address this issue, we introduce a novel approach using knowledge unlearning to selectively remove virus-related knowledge while retaining other useful capabilities. We propose a learning scheme, PROEDIT, for editing a pre-trained protein language model toward safe and responsible mutation effect prediction. Extensive validation on open benchmarks demonstrates that PROEDIT significantly reduces the model's ability to enhance the properties of virus mutants without compromising its performance on non-virus proteins. As the first thorough exploration of safety issues in deep learning solutions for protein engineering, this study provides a foundational step toward ethical and responsible AI in biology.
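The abstract does not spell out PROEDIT's training objective, but knowledge unlearning is commonly framed as pushing the model's likelihood down on a "forget" set while anchoring it on a "retain" set. The sketch below illustrates that generic recipe only; the function name `unlearning_step`, the weight `lambda_retain`, and the batch format are illustrative assumptions, not the paper's method.

```python
def unlearning_step(model, optimizer, forget_batch, retain_batch,
                    lambda_retain=1.0):
    """One hypothetical unlearning update: gradient ascent on virus
    ("forget") sequences, ordinary likelihood training on non-virus
    ("retain") sequences to preserve general capability.

    Assumes a HuggingFace-style masked protein LM: both batches are
    dicts with `input_ids` and `labels`, and the forward pass returns
    an object with a `.loss` attribute.
    """
    model.train()
    optimizer.zero_grad()

    # Loss we want to INCREASE: the model should become worse at
    # modeling virus sequences, so we negate it (gradient ascent).
    forget_loss = model(**forget_batch).loss

    # Loss we want to DECREASE: standard training on non-virus
    # proteins keeps performance there from degrading.
    retain_loss = model(**retain_batch).loss

    total = -forget_loss + lambda_retain * retain_loss
    total.backward()
    optimizer.step()
    return forget_loss.item(), retain_loss.item()
```

In this framing, `lambda_retain` trades off how aggressively virus knowledge is removed against how well non-virus performance is preserved, which mirrors the paper's stated goal of reducing virus-mutant predictions without compromising other proteins.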
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2472