Benchmarking Uncertainty Quantification for Protein Engineering

Kevin P. Greenman; Ava Soleimany; Kevin K Yang

Published: 05 Apr 2022, Last Modified: 05 May 2023, MLDD Poster, Readers: Everyone

Keywords: protein engineering, protein design, uncertainty quantification, benchmark

TL;DR: We implement a panel of deep learning uncertainty quantification methods on the Fitness Landscape Inference for Proteins (FLIP) benchmark regression tasks.

Abstract: Machine learning sequence-function models for proteins could enable significant advances in protein engineering, especially when paired with state-of-the-art methods to select new sequences for property optimization and/or model improvement. Such methods, namely Bayesian optimization and active learning, require calibrated estimates of model uncertainty. While studies have benchmarked a variety of deep learning uncertainty quantification (UQ) methods on standard and molecular machine-learning datasets, it is not clear how well these results extend to protein datasets. In this work, we implement a panel of deep learning UQ methods on the Fitness Landscape Inference for Proteins (FLIP) benchmark regression tasks. We compare results across different degrees of distributional shift using metrics that assess each UQ method's accuracy, calibration, coverage, width, and rank correlation, and we use these comparisons to provide recommendations for the effective design of biological sequences.
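
The five metric families named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact definitions used in the work, so everything below is an assumption: the function name `uq_metrics`, the Gaussian predictive distribution (a mean and a standard deviation per sequence), the 95% interval (`z=1.96`), and the choice of a miscalibration area as the calibration score are common conventions for regression UQ, not necessarily the authors' implementation.

```python
import math

def _ranks(values):
    # Ordinal ranks (ties are not averaged; adequate for continuous values).
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    for rank, idx in enumerate(order):
        ranks[idx] = float(rank)
    return ranks

def _pearson(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def uq_metrics(y_true, y_pred, y_std, z=1.96):
    """Hypothetical helper: assumes Gaussian predictions with y_std > 0."""
    n = len(y_true)
    abs_err = [abs(t - p) for t, p in zip(y_true, y_pred)]

    # Accuracy: root-mean-square error of the predicted means.
    rmse = math.sqrt(sum(e * e for e in abs_err) / n)

    # Coverage: fraction of true values inside the +/- z*std interval.
    coverage = sum(e <= z * s for e, s in zip(abs_err, y_std)) / n

    # Width: mean interval width relative to the range of the true values.
    width = (sum(2 * z * s for s in y_std) / n) / (max(y_true) - min(y_true))

    # Calibration: for each point, the smallest central confidence level
    # whose interval contains the truth is erf(|error| / (std * sqrt(2))).
    # Compare observed and expected coverage over a grid of levels.
    p_needed = [math.erf(e / (s * math.sqrt(2))) for e, s in zip(abs_err, y_std)]
    levels = [k / 100 for k in range(101)]
    observed = [sum(p <= lvl for p in p_needed) / n for lvl in levels]
    miscal = sum(abs(o - l) for o, l in zip(observed, levels)) / len(levels)

    # Rank correlation: Spearman correlation between |error| and std.
    rank_corr = _pearson(_ranks(abs_err), _ranks(y_std))

    return {"rmse": rmse, "coverage": coverage, "width": width,
            "miscalibration_area": miscal, "rank_corr": rank_corr}
```

Under these assumptions, a well-behaved UQ method would show low RMSE, coverage close to the nominal 95%, narrow relative width, a small miscalibration area, and a rank correlation near 1, meaning that larger predicted uncertainties accompany larger errors.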
