Multi-Objective Design of DNA-Stabilized Nanoclusters Using Variational Autoencoders With Automatic Feature Extraction
Abstract: DNA-stabilized silver nanoclusters (AgN-DNAs) have sequence-tuned compositions and fluorescence colors. High-throughput experiments together with supervised machine learning models have recently enabled design of DNA templates that select for AgN-DNA properties, including near-infrared (NIR) emission that holds promise for deep tissue bioimaging. However, these existing models do not enable simultaneous selection of multiple AgN-DNA properties, and require significant expert input for feature engineering and class definitions. This work presents a model for multiobjective, continuous-property design of AgN-DNAs with automatic feature extraction, based on variational autoencoders (VAEs). This model is generative, i.e., it learns both the forward mapping from DNA sequence to AgN-DNA properties and the inverse mapping from properties to sequence, and is trained on an experimental data set of DNA sequences paired with AgN-DNA fluorescence properties. Experimental testing shows that the model enables effective design of AgN-DNA emission, including bright NIR AgN-DNAs with 4-fold greater abundance compared to training data. In addition, Shapley analysis is employed to discern learned nucleobase patterns that correspond to fluorescence color and brightness. This generative model can be adapted for a range of biomolecular systems with sequence-dependent properties, enabling precise design of emerging biomolecular nanomaterials.