Accelerating Benchmarking of Functional Connectivity Modeling via Structure-aware Core-set Selection

Published: 26 Jan 2026, Last Modified: 26 Feb 2026 · ICLR 2026 Poster · CC BY 4.0
Keywords: Functional Connectivity Benchmark, Core-set Selection, Network Modeling, Structure-aware Sampling
TL;DR: We frame the functional connectivity benchmarking task as a ranking recommendation problem and propose a self-supervised core-set selection framework that achieves up to 23.2% higher ranking stability than baselines at a 10% sampling rate.
Abstract: Benchmarking the hundreds of functional connectivity (FC) modeling methods on large-scale fMRI datasets is critical for reproducible neuroscience. However, the combinatorial explosion of model–data pairings makes exhaustive evaluation computationally prohibitive, preventing such assessments from becoming a routine pre-analysis step. To break this bottleneck, we reframe FC benchmarking as the selection of a small, representative *core-set* whose sole purpose is to preserve the relative performance ranking of FC operators. We formalize this as a ranking-preserving subset selection problem and propose **S**tructure-aware **C**ontrastive **L**earning for **C**ore-set **S**election (**SCLCS**), a self-supervised framework for selecting these core-sets. **SCLCS** first uses an adaptive Transformer to learn each sample's unique FC structure. It then introduces a novel **S**tructural **P**erturbation **S**core (**SPS**) to quantify the stability of these learned structures during training, identifying samples that represent foundational connectivity archetypes. Finally, while **SCLCS** identifies stable samples via a top-$k$ ranking, we further introduce a **density-balanced sampling strategy** as a necessary correction to promote diversity, ensuring the final core-set is both structurally robust and distributionally representative. On the large-scale REST-meta-MDD dataset, **SCLCS** preserves the ground-truth model ranking with just 10% of the data, outperforming state-of-the-art (SOTA) core-set selection methods by up to 23.2% in ranking consistency (nDCG@k). To our knowledge, this is the first work to formalize core-set selection for FC operator benchmarking, thereby making large-scale operator comparisons a feasible and integral part of computational neuroscience. Code is publicly available at: [https://github.com/lzhan94swu/SCLCS](https://github.com/lzhan94swu/SCLCS)
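The abstract's evaluation criterion, ranking consistency via nDCG@k, can be illustrated with a minimal sketch: given ground-truth operator scores from the full benchmark and the operator ranking induced by a core-set, nDCG@k measures how well the top of the ranking is preserved. This is a generic nDCG implementation, not the paper's code; the function and variable names are hypothetical.

```python
import math

def ndcg_at_k(true_scores, pred_ranking, k):
    """nDCG@k between a predicted operator ranking and ground-truth scores.

    true_scores: dict mapping operator id -> full-benchmark performance score
    pred_ranking: list of operator ids, best first (e.g. from a core-set run)
    k: cutoff for the top of the ranking
    """
    # DCG of the predicted ranking: relevance discounted by log2 of position
    dcg = sum(true_scores[op] / math.log2(i + 2)
              for i, op in enumerate(pred_ranking[:k]))
    # Ideal DCG: the same discount applied to the best possible ordering
    ideal = sorted(true_scores.values(), reverse=True)[:k]
    idcg = sum(s / math.log2(i + 2) for i, s in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0

# Hypothetical example: three FC operators scored on the full dataset
full_scores = {"op_a": 3.0, "op_b": 2.0, "op_c": 1.0}
print(ndcg_at_k(full_scores, ["op_a", "op_b", "op_c"], k=3))  # perfect: 1.0
print(ndcg_at_k(full_scores, ["op_c", "op_b", "op_a"], k=3))  # reversed: < 1.0
```

A core-set is then judged by how close this value stays to 1.0 when the ranking is computed from the selected subset instead of the full dataset.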
Supplementary Material: zip
Primary Area: applications to neuroscience & cognitive science
Submission Number: 2808