CLASSIC: A platform for high throughput mapping of genetic design spaces in mammalian cells and ML guided prediction of gene circuit behavior

Published: 04 Mar 2024, Last Modified: 29 Apr 2024, GEM Poster, CC BY 4.0
Track: Biology: datasets and/or experimental results
Cell: I do not want my work to be considered for Cell Systems
Keywords: High throughput data, genetic circuits, mammalian synthetic biology, machine learning, deep neural networks, predictive circuit design
TL;DR: We describe the development of CLASSIC, a platform for construction and high throughput characterization of massive libraries of large genetic compositions (>10kb) in mammalian cells
Abstract: Massively parallel genetic screens have been used to map sequence-to-function relationships for a variety of genetic elements. However, because these approaches can only interrogate short sequences, it remains challenging to perform high throughput (HT) assays on constructs containing combinations of multiple sequence elements arranged across multi-kb length scales. Overcoming this barrier could accelerate genetic design. For example, by screening diverse gene circuit designs, “composition-to-function” mappings could be created that provide insight into genetic part composability. Here, we introduce CLASSIC, a novel genetic screening platform that combines long- and short-read next-generation sequencing (NGS) modalities to quantitatively assess pools of constructs of arbitrary length containing diverse part compositions. We show that CLASSIC can measure expression profiles of >10$^{5}$ drug-inducible gene circuit designs (from 6-9 kb) in a single experiment in human cells. As we show, with a dataset of this size, it is possible to train machine learning (ML) models that not only predict the behavior of circuits from unmeasured regions of circuit design space, but can also be used as base models to expand design space mapping through an iterative active learning process. Furthermore, we show that by mapping entire circuit design landscapes, we gain critical insight into underlying circuit design and part composability principles that extend our understanding beyond standard biophysical models, and use active learning cycles to explore new part types and expand the total explored design space to ~10$^{6}$-10$^{7}$ members. Overall, our work shows that the expanded experimental throughput offered by CLASSIC dramatically augments the pace and scale of genetic design and establishes an experimental basis for AI-driven design of complex genetic systems.
Submission Number: 105