Cells2Vec: Bridging the gap between experiments and simulations using causal representation learning

Published: 27 Oct 2023, Last Modified: 05 Dec 2023. Venue: CRL@NeurIPS 2023 Poster
Keywords: Causal Representation Learning, Deep Learning, Biological Simulations, Model Calibration, Agent Based Modelling
TL;DR: We used Causal Representation Learning to infer biological simulation-specific parameters from real-world experiments, helping improve traditional model calibration.
Abstract: Calibrating computational simulations of biological dynamics against experimental observations is often challenging. In particular, selecting the features used to construct a goodness-of-fit function for agent-based models of spatiotemporal behaviour can be difficult (Yip et al., 2022). In this study, we generate one-dimensional embeddings of high-dimensional simulation outputs using causal dilated convolutions for encoding and a triplet loss-based training strategy. We verify the robustness of the trained encoder on simulations generated from unseen input parameter sets. Furthermore, we use the generated embeddings to estimate simulation parameters with XGBoost regression. We then apply the same parameter estimation to the corresponding real-world experimental time series, identifying a causal relationship between simulation-specific input parameters and real-world experiments. Our regression approach estimates simulation parameters with an average $R^2$ of 0.90 across model runs with embedding dimensions of 4, 8, 12, and 16. Model calibration led to simulations with an average cosine similarity of 0.95 and an average normalized Euclidean similarity of 0.69 with real-world experiments over multiple model runs.
Submission Number: 19
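
The abstract names three building blocks: causal dilated convolutions, a triplet loss, and cosine similarity as an agreement measure. The following is a minimal pure-Python sketch of each, not the authors' implementation. The function names, the zero left-padding, the margin-based form of the triplet loss, and the default margin of 1.0 are all assumptions made for illustration; the abstract does not specify them. The normalized Euclidean similarity is left out because the abstract does not define it.

```python
import math


def causal_dilated_conv1d(x, kernel, dilation):
    """Causal dilated 1-D convolution over a single-channel series.

    Computes y[t] = sum_k kernel[k] * x[t - k * dilation], treating
    samples before the start of the series as zero (assumed padding).
    The output at time t never depends on inputs later than t.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # indices before the series start contribute zero
                acc += w * x[idx]
        out.append(acc)
    return out


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard margin-based triplet loss on embedding vectors (assumed form).

    The loss is zero once the anchor is closer to the positive than to
    the negative by at least `margin` in Euclidean distance.
    """
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


if __name__ == "__main__":
    series = [1.0, 2.0, 3.0, 4.0, 5.0]
    # Two-tap kernel with dilation 2: each output mixes x[t] and x[t - 2].
    print(causal_dilated_conv1d(series, [1.0, 1.0], dilation=2))
    print(triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0]))
    print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))
```

In an encoder of this kind, layers with increasing dilation are typically stacked so that the receptive field grows quickly with depth, and the triplet loss pulls embeddings of related simulation outputs together while pushing unrelated ones apart. How triplets are sampled in this work is not stated in the abstract.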