Unpaired-to-paired data synthesis: Learning to model disease effects via contrastive analysis of neuroimaging-derived features
Keywords: contrastive analysis, variational inference, synthetic data generation, radiomics, neuroimaging
TL;DR: We use a contrastive analysis strategy to generate synthetic paired data for neuroimaging-derived features.
Abstract: Advances in machine learning have enabled the analysis of complex, high-dimensional datasets, yet neuroimaging lags behind due to data privacy and sharing constraints. Synthetic data offers a promising solution for developing and training models. However, synthesizing disease-specific datasets is challenging because neurological disorders induce progressive changes in the brain that are subtle and often obscured by normal brain variability. Contrastive analysis provides a framework for learning generative factors that deconvolve variation shared between background (e.g., healthy) and target (e.g., diseased) datasets from variation unique to the target, making it particularly effective for capturing and modeling subtle disease effects. In this paper, we reformulate this framework to synthesize tabular neuroimaging-derived features, specifically brain regional volumes from T1-weighted structural MRI. Given unpaired neuroimaging samples of healthy and diseased participants, we learn to generate paired healthy and diseased feature representations that emulate real disease effects. We show that paired synthesis enables fine-grained, individual-level modeling of disease effects, improving downstream analyses and supporting more precise exploration of disease heterogeneity. We validate the models on both semi-synthetic and real-world brain regional volume datasets specifically designed to highlight the heterogeneity-parsing capability of contrastive analysis. The models are available at: [link].
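The pairing idea in the abstract can be illustrated with a minimal sketch. It assumes a contrastive latent variable model in which each sample has a shared latent `z` (variation common to healthy and diseased participants) and a salient latent `s` (variation unique to the diseased group), and in which a healthy counterpart is obtained by setting `s` to zero. The linear decoder, the weight matrices `W_shared` and `W_salient`, and all dimension sizes are hypothetical stand-ins chosen for illustration; they are not the models described in the submission.

```python
import numpy as np

def decode(z, s, W_shared, W_salient):
    """Toy linear decoder (an assumption): shared part plus disease-specific part."""
    return z @ W_shared + s @ W_salient

def synthesize_pair(z, s, W_shared, W_salient):
    """Return (healthy, diseased) feature vectors that share the same z.

    The healthy counterpart is produced by zeroing the salient latent s,
    so the pair differs only through the disease-specific factors.
    """
    diseased = decode(z, s, W_shared, W_salient)
    healthy = decode(z, np.zeros_like(s), W_shared, W_salient)
    return healthy, diseased

rng = np.random.default_rng(0)
n_regions, d_shared, d_salient = 5, 3, 2  # hypothetical sizes

# Random weights stand in for a trained decoder.
W_shared = rng.normal(size=(d_shared, n_regions))
W_salient = rng.normal(size=(d_salient, n_regions))

# Latents for four synthetic participants.
z = rng.normal(size=(4, d_shared))
s = rng.normal(size=(4, d_salient))

healthy, diseased = synthesize_pair(z, s, W_shared, W_salient)

# Individual-level disease effect on each regional volume feature.
effect = diseased - healthy
```

In this toy setting the per-participant `effect` depends only on `s`, which is what makes paired samples convenient for studying heterogeneity in disease effects separately from normal anatomical variability.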
Primary Area: applications to neuroscience & cognitive science
Submission Number: 20827