A Biology-Informed Similarity Metric for Simulated Patches of Human Cell MembraneDownload PDF

29 Sept 2021 (modified: 13 Feb 2023)ICLR 2022 Conference Withdrawn SubmissionReaders: Everyone
Keywords: Applications, Cancer Biology, Autonomous Multiscale, Metric Learning, Similarity Metrics
Abstract: Complex scientific inquiries rely increasingly upon large and autonomous multiscale simulation campaigns, which fundamentally require similarity metrics to quantify "sufficient'' changes among data and/or configurations. However, subject matter experts are often unable to articulate similarity precisely or in terms of well-formulated definitions, especially when new hypotheses are to be explored, making it challenging to design a meaningful metric. Furthermore, the key to practical usefulness of such metrics to enable autonomous simulations lies in in situ inference, which requires generalization to possibly substantial distributional shifts in unseen, future data. Here, we address these challenges in a cancer biology application and develop a meaningful similarity metric for "patches" --- regions of simulated human cell membrane that express interactions between certain proteins of interest and relevant lipids. In the absence of well-defined conditions for similarity, we leverage several biology-informed notions about data and the underlying simulations to impose inductive biases on our metric learning framework, resulting in a suitable similarity metric that also generalizes well to significant distributional shifts encountered during the deployment. We combine these intuitions to organize the learned metric space in a multiscale manner, which makes the metric robust to incomplete and even contradictory intuitions. Our approach delivers a metric that not only performs well on the conditions used for its development and other relevant criteria, but also learns key temporal relationships from statistical mechanics without ever being exposed to any such information during training.
One-sentence Summary: A new framework for similarity metric learning for a crucial biology application
5 Replies

Loading