MELON: Multimodal Learning Framework for Spatial Multimodal Omics Data Integration

Published: 30 May 2026, Last Modified: 30 May 2026, ICML2026-AI4Science Poster, CC BY 4.0
Track: Track 1: Original Research/Position/Education/Attention Track
Keywords: Spatial Multi-Omics, Spatial Transcriptomics, Multimodal Representation Learning, Contrastive Learning, Fusion Transformer, Cross-Modal Synergy, Spatial Domain Identification, Computational Biology, Self-Supervised Learning
Abstract: Spatial multi-omics technologies measure multiple molecular modalities on the same tissue section, but existing integration methods optimize for shared structure and rarely preserve the cross-modal synergistic signal that emerges only from joint observation across modalities. We present MELON, a representation-learning framework built around a partial-information-decomposition (PID)-guided contrastive objective that explicitly preserves redundant, unique, and synergistic cross-modal information, combined with a learned neighborhood-aware spatial bias that respects local tissue structure. On a controlled simulation isolating cross-modal synergy, MELON recovers a synergy-only label well above chance while seven multi-omics baselines remain at chance. On three real-data benchmarks spanning RNA–ATAC and RNA–protein settings, MELON achieves consistently higher agreement with anatomical reference labels than seven established baselines and produces more spatially contiguous domains with sharper boundaries; a trimodal tonsil extension confirms that these gains transfer beyond the bimodal setting.
Submission Number: 241
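
Illustrative note: the "synergy-only label" mentioned in the abstract can be pictured with a toy example. The sketch below is not the paper's simulation. It assumes a hypothetical setup in which two modalities each carry one noisy binary latent and the label is their XOR, so that neither modality alone predicts the label while the pair does. All names (`simulate_synergy`, `best_accuracy`, the noise level, the 0.5 threshold) are assumptions made for illustration.

```python
import numpy as np


def simulate_synergy(n=4000, noise=0.1, seed=0):
    """Toy two-modality data with an XOR (synergy-only) label.

    Hypothetical setup: modality 1 observes latent a, modality 2 observes
    latent b, and the label is a XOR b.
    """
    rng = np.random.default_rng(seed)
    a = rng.integers(0, 2, size=n)
    b = rng.integers(0, 2, size=n)
    y = a ^ b
    x1 = a + noise * rng.standard_normal(n)  # noisy view of a
    x2 = b + noise * rng.standard_normal(n)  # noisy view of b
    return x1, x2, y


def best_accuracy(pred, y):
    """Accuracy of a binary prediction, allowing a global label flip."""
    acc = float(np.mean(pred == y))
    return max(acc, 1.0 - acc)


x1, x2, y = simulate_synergy()

# Threshold each modality on its own: the label is independent of either
# latent alone, so these stay near 0.5.
acc_m1 = best_accuracy((x1 > 0.5).astype(int), y)
acc_m2 = best_accuracy((x2 > 0.5).astype(int), y)

# Combine both thresholded modalities with XOR: the label is recovered.
acc_joint = best_accuracy(((x1 > 0.5) ^ (x2 > 0.5)).astype(int), y)

print(f"modality 1 alone: {acc_m1:.3f}")
print(f"modality 2 alone: {acc_m2:.3f}")
print(f"joint (XOR):      {acc_joint:.3f}")
```

In this toy setting, a method that keeps only the structure shared between the two modalities has nothing to recover, since the two latents are independent; the label lives entirely in their combination.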