Assessing Quantization and Efficient Fine-Tuning for Protein Language Models

Published: 06 Mar 2025, Last Modified: 26 Apr 2025
Venue: GEM
License: CC BY 4.0
Track: Machine learning: computational method and/or computational results
Nature Biotechnology: Yes
Keywords: Machine Learning, Protein Language Model, LLM, Fine-Tuning, Quantization, 4-bit, PEFT, Protein Generation
TL;DR: Quantization and parameter-efficient fine-tuning maintain performance while reducing memory and power consumption.
Abstract: Proteins are essential to life and function, and discovering new proteins can unlock new therapeutics and industrial applications. However, the space of proteins is incredibly large and diverse, making the discovery of useful proteins difficult. Machine learning (ML) models help search this space, finding candidate proteins for specific goals and reducing the need for costly experimentation. The recent trend of increasing model scale creates more demanding computational requirements, especially for large language models (LLMs) and their protein language model (PLM) counterparts. Quantization and efficient fine-tuning methods can help offset this by reducing the memory and training required to use ML models. Here we show that combining 4-bit quantization with efficient training via low-rank adapters maintains $>$90\% of the performance of most models on protein prediction tasks, while reducing memory consumption by 46.7\% on average. Generative models that are 4-bit quantized use 76.4\% less memory and show no significant difference in the quality of their generated proteins. This is the first benchmark of quantized training with parameter-efficient fine-tuning for PLMs that retains nearly all of their performance, lowering the requirements and barrier to entry for practitioners.
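For illustration, a minimal sketch of the kind of setup the abstract describes: loading a PLM with 4-bit quantization and attaching low-rank adapters via the Hugging Face transformers, bitsandbytes, and peft libraries. The checkpoint name, adapter rank, and target modules below are illustrative assumptions, not necessarily the configurations benchmarked in this submission.

```python
# Hedged sketch: 4-bit quantized loading of a protein language model plus LoRA
# adapters for parameter-efficient fine-tuning on a sequence-level task.
import torch
from transformers import AutoModelForSequenceClassification, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 weight quantization; compute is done in bfloat16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Example PLM checkpoint (assumption): an ESM-2 model for binary classification.
model = AutoModelForSequenceClassification.from_pretrained(
    "facebook/esm2_t33_650M_UR50D",
    num_labels=2,
    quantization_config=bnb_config,
)
model = prepare_model_for_kbit_training(model)

# Low-rank adapters on the attention projections; rank and dropout are illustrative.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["query", "key", "value"],
    task_type="SEQ_CLS",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights remain trainable
```

Under this setup, the frozen base weights stay in 4-bit precision while only the small adapter matrices are updated, which is what yields the memory savings the abstract reports.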
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Presenter: ~Sebastian_Clancy1
Format: Yes, the presenting author will attend in person if this work is accepted to the workshop.
Funding: Yes, the presenting author of this submission falls under ICLR’s funding aims, and funding would significantly impact their ability to attend the workshop in person.
Submission Number: 95