scCMIA: Self-supervised Dual Model for Mitigating Information Loss in Single-cell Cross-Modal Alignment

Xuanwei Lin; Penghen Hu; Hebing Chen; Ximeng Liu; Xiaochen Bo; Hao Li

18 Sept 2025 (modified: 27 Jan 2026), ICLR 2026 Conference Withdrawn Submission, CC BY 4.0

Keywords: Single-cell, Self-supervised, Alignment, Reconstruction, scRNA, scATAC

Abstract: Recent technological advances in single-cell sequencing have enabled simultaneous profiling of multiple omics modalities within individual cells. Despite these advancements, challenges such as high noise levels and information loss during computational integration persist. While existing methods align different modalities, they often struggle to balance alignment accuracy with the preservation of modality-specific information needed for downstream biological discovery. In this paper, we introduce scCMIA, a novel framework guided by Mutual Information (MI) principles that leverages a VQ-VAE architecture. scCMIA achieves robust cross-modal alignment in a unified discrete latent space while enabling high-fidelity reconstruction of the original data modalities. Crucially, our framework transforms the learned discrete representations into a tool for tangible biological discovery, allowing for the quantification of regulatory programs and cross-modal relationships. Our extensive experiments demonstrate that scCMIA achieves state-of-the-art performance across multiple datasets. Our code is available at: https://anonymous.4open.science/r/scCMIA-77E3.

Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)

Submission Number: 12979
