Reframing Retrieval-Augmented Generation for *in silico* optimization of antibody solubility

Published: 06 Mar 2025, Last Modified: 26 Apr 2025 · GEM · CC BY 4.0
Track: Machine learning: computational method and/or computational results
Nature Biotechnology: Yes
Keywords: Antibody optimization, in silico developability optimization, Retrieval-Augmented Generation, Protein Language Models
TL;DR: We introduce a novel adaptation of Retrieval-Augmented Generation (RAG) for optimizing antibody solubility *in silico* that enables control over the balance between developability optimization and retention of antibody functionality.
Abstract: Antibodies are successful biotherapeutics used for the treatment of various diseases. Throughout their therapeutic development, antibody candidates require optimization for drug developability while retaining their functionality. This task remains a significant challenge, as it is constrained by low-throughput experimental measurements. Retrieval-Augmented Generation (RAG) was developed in natural language processing to generate more accurate text responses by combining a retriever, a generator, and a knowledge database. Here, we present a novel adaptation of this framework for the developability optimization of antibodies. Using solubility as a proof of concept, we demonstrate that this framework generates optimized antibody sequences with improved solubility scores when evaluated *in silico*. This RAG framework allows precise control over the optimization process with the aim of preserving the functionality of the antibody candidate. Moreover, the modular design enables adaptability across diverse optimization campaigns using a generalizable knowledge database, which has the potential to substantially reduce the experimental effort required for antibody developability optimization.
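The abstract describes the framework only at the level of its components (retriever, generator, knowledge database, and a controllable optimization objective); no implementation details are given. The following is a minimal, purely illustrative sketch of how such a retrieve-propose-score loop could be wired together. Every function, threshold, and scoring rule here (`solubility_score`, `retrieve`, `propose`, `min_identity`) is a hypothetical stand-in and not the authors' method.

```python
# Minimal sketch (all names hypothetical): a RAG-style loop for antibody
# solubility optimization. Retrieve similar sequences from a knowledge
# database, propose edits conditioned on them, and accept a proposal only if
# the in-silico solubility score improves while the sequence stays close to
# the parent (a crude stand-in for "retaining functionality").
import random
from difflib import SequenceMatcher

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def similarity(a: str, b: str) -> float:
    """Crude sequence similarity in [0, 1]; a real system would use alignments or PLM embeddings."""
    return SequenceMatcher(None, a, b).ratio()


def solubility_score(seq: str) -> float:
    """Placeholder in-silico solubility predictor (here: fraction of hydrophilic residues)."""
    hydrophilic = set("DEKNQRSTH")
    return sum(residue in hydrophilic for residue in seq) / len(seq)


def retrieve(query: str, database: list[str], k: int = 3) -> list[str]:
    """Retriever: return the k database sequences most similar to the query."""
    return sorted(database, key=lambda s: similarity(query, s), reverse=True)[:k]


def propose(query: str, exemplars: list[str]) -> str:
    """Generator stand-in: copy one position from a retrieved exemplar into the query."""
    exemplar = random.choice(exemplars)
    pos = random.randrange(min(len(query), len(exemplar)))
    return query[:pos] + exemplar[pos] + query[pos + 1:]


def optimize(seq: str, database: list[str], steps: int = 50, min_identity: float = 0.85) -> str:
    """Accept a proposal only if solubility improves and identity to the parent stays high."""
    parent, current = seq, seq
    for _ in range(steps):
        candidate = propose(current, retrieve(current, database))
        if (solubility_score(candidate) > solubility_score(current)
                and similarity(candidate, parent) >= min_identity):
            current = candidate
    return current


if __name__ == "__main__":
    random.seed(0)
    database = ["".join(random.choices(AMINO_ACIDS, k=30)) for _ in range(100)]
    start = "".join(random.choices(AMINO_ACIDS, k=30))
    improved = optimize(start, database)
    print(f"solubility: {solubility_score(start):.2f} -> {solubility_score(improved):.2f}")
```

The `min_identity` constraint is one simple way to expose the "precise control" the abstract mentions: tightening it favors functionality retention, relaxing it favors solubility gains. Swapping the placeholder scorer and generator for a trained predictor and a protein language model would preserve the same loop structure.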
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Presenter: ~Lena_Erlach1
Format: Yes, the presenting author will attend in person if this work is accepted to the workshop.
Funding: No, the presenting author of this submission does *not* fall under ICLR’s funding aims, or has sufficient alternate funding.
Submission Number: 64
