Machine learning-guided design of biomolecular condensates in an automated laboratory
Keywords: ML-guided design, Bayesian Optimization, Automated laboratory, Biomolecular condensate
Abstract: Biomolecular condensates are phase-separated cellular compartments that regulate signaling, stress responses, and molecular sequestration. Designing synthetic condensates with specified phase behavior and material properties remains difficult because the sequence–property landscape in human cells is complex and context-dependent. Here we present a generative ML–guided design–build–test–learn loop that couples experimental measurements with machine learning to discover condensate-forming sequences. We begin with a domain-expert seed library and perform high-throughput live-cell confocal imaging across multiple cell cycles. An automated image-processing pipeline extracts functionally relevant properties, including saturation concentration, size distribution, and morphology, producing a curated sequence–property dataset. We then fit a multi-output Gaussian Process surrogate and use Bayesian Optimization (BO) to propose new candidate sequences, closing the loop between computation and experimentation. Our approach reduces the number of iterations needed to reach an optimal design when the objective is expensive to evaluate, as is the sequence–property landscape of biomolecular condensates. The work contributes experimental results, a reusable benchmark dataset, and a practical strategy for generative ML in biomolecular proteomics.
Presenter: ~Peiran_Jiang1
Format: Maybe: the presenting author will attend in person, contingent on other factors that still need to be determined (e.g., visa, funding).
Funding: No, the presenting author of this submission does not fall under ICLR’s funding aims, or has sufficient alternate funding.
Submission Number: 82