TCRGenesis: Generation of SIINFEKL-specific T-cell receptor sequences using autoregressive Transformer

Published: 13 Oct 2024, Last Modified: 01 Dec 2024, Venue: AIDrugX Poster, License: CC BY 4.0
Keywords: biological sequence design, protein design, protein generation
Abstract: Engineered T-cell therapies are a promising new approach for treating previously incurable diseases. These therapies involve genetically modified T cells expressing custom T-cell receptors (TCRs) that recognize antigens from cancer, virus-infected, or autoimmune cells. However, the identification or generation of suitable TCRs remains an unsolved challenge. Computational methods have the potential to accelerate the development of TCRs that bind target antigens. While computational investigation of the TCR-epitope landscape has mainly focused on binding prediction, synthetic TCR design has recently emerged as the next frontier. Here, we present a proof-of-concept study on generating full TCR sequences reactive to a fixed epitope $\textit{in silico}$. To this end, we used a unique dataset comprising thousands of TCRs experimentally validated as reactive towards the model epitope-MHC complex SIINFEKL/H2-K$^b$, together with a naive TCR background, to train our autoregressive transformer model TCRGenesis. The model generated a repertoire of realistic TCRs, as validated through various biophysical and sequence properties. Further, the generated sequences received high binding scores from a predictor developed specifically for this evaluation. The generator inherently captured the rules governing binding to SIINFEKL: the perplexity it assigns to real, unseen TCR sequences separates binding from non-binding TCRs well, and the generated sequences resembled binders. This work marks one of the first steps in the full-sequence design of antigen-specific TCRs $\textit{in silico}$, which we envision will accelerate the development of future immunotherapies and personalized medicine through rapid and reliable TCR synthesis.
Submission Number: 123
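
The abstract uses the perplexity an autoregressive model assigns to a sequence as a signal for separating binders from non-binders. The sketch below only illustrates how such a per-sequence perplexity is computed, as the exponential of the negative mean log-probability of each token given its prefix. It is not the TCRGenesis model: a smoothed bigram model stands in for the Transformer, and the names `train_bigram`, `perplexity`, `toy_train`, and the example strings are all hypothetical placeholders, not data or code from the paper.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
START, END = "^", "$"          # assumed sequence boundary tokens
VOCAB = list(AMINO_ACIDS) + [END]


def train_bigram(sequences, alpha=1.0):
    """Fit an add-alpha smoothed bigram model P(next | previous).

    This is a toy stand-in for an autoregressive Transformer: it only
    conditions on the previous residue instead of the full prefix.
    """
    counts = {}
    for seq in sequences:
        tokens = [START, *seq, END]
        for prev, nxt in zip(tokens, tokens[1:]):
            row = counts.setdefault(prev, {})
            row[nxt] = row.get(nxt, 0.0) + 1.0

    def log_prob(prev, nxt):
        row = counts.get(prev, {})
        total = sum(row.values()) + alpha * len(VOCAB)
        return math.log((row.get(nxt, 0.0) + alpha) / total)

    return log_prob


def perplexity(seq, log_prob):
    """exp of the negative mean per-token log-probability of seq."""
    tokens = [START, *seq, END]
    log_probs = [log_prob(p, n) for p, n in zip(tokens, tokens[1:])]
    return math.exp(-sum(log_probs) / len(log_probs))


# Made-up placeholder strings, used only to exercise the functions.
toy_train = ["CASSL", "CASSF", "CASSY"]
model = train_bigram(toy_train)

in_distribution = perplexity("CASSL", model)
out_of_distribution = perplexity("WWWWW", model)
print(in_distribution < out_of_distribution)
```

A sequence that resembles the training set gets a lower perplexity than one that does not. With a model trained on binders, thresholding this score is one simple way such a separation could be evaluated; the paper's actual evaluation procedure is not described in the abstract.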