Local Spatiotemporal Representation Learning for Longitudinally-consistent Neuroimage Analysis

Mengwei Ren; Neel Dey; Martin Andreas Styner; Kelly Botteron; Guido Gerig

Published: 31 Oct 2022, Last Modified: 04 Aug 2025, NeurIPS 2022 Accept, Readers: Everyone

Keywords: neuroimaging, medical image analysis, spatiotemporal representation learning, longitudinal learning, medical image segmentation

Abstract: Recent self-supervised advances in medical computer vision exploit the global and local anatomical self-similarity for pretraining prior to downstream tasks such as segmentation. However, current methods assume i.i.d. image acquisition, which is invalid in clinical study designs where follow-up longitudinal scans track subject-specific temporal changes. Further, existing self-supervised methods for medically-relevant image-to-image architectures exploit only spatial or temporal self-similarity and do so via a loss applied only at a single image-scale, with naive multi-scale spatiotemporal extensions collapsing to degenerate solutions. To these ends, this paper makes two contributions: (1) It presents a local and multi-scale spatiotemporal representation learning method for image-to-image architectures trained on longitudinal images. It exploits the spatiotemporal self-similarity of learned multi-scale intra-subject image features for pretraining and develops several feature-wise regularizations that avoid degenerate representations; (2) During finetuning, it proposes a surprisingly simple self-supervised segmentation consistency regularization to exploit intra-subject correlation. Benchmarked across various segmentation tasks, the proposed framework outperforms both well-tuned randomly-initialized baselines and current self-supervised techniques designed for both i.i.d. and longitudinal datasets. These improvements are demonstrated across both longitudinal neurodegenerative adult MRI and developing infant brain MRI and yield both higher performance and longitudinal consistency.

TL;DR: Given longitudinal neuroimages with scarce annotation, this paper develops a self-supervised spatiotemporal representation learning method and a consistency-regularization term for image-to-image networks.

Supplementary Material: pdf

Community Implementations: [2 code implementations](https://www.catalyzex.com/paper/local-spatiotemporal-representation-learning/code)
