Scaling Multimodal Temporal Graphs with Event-Adaptive Compression and Sparse Connectivity

09 Sept 2025 (modified: 29 Jan 2026) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Multimodal Learning, Temporal Graphs, Graph Neural Network, Representation Learning
Abstract: Multimodal temporal data analysis faces a fundamental tension: it must balance the high resolution needed to capture sudden events against the wide temporal range needed for scalability. The result is often a vast graph model that is computationally intractable. Existing approaches typically either break sequences into fixed-length segments or prune edges to stay within a budget, often at the cost of fidelity. We introduce EAMC–C2SG, a framework that dynamically compresses temporal streams into event-adaptive segments and builds a sparse graph that respects temporal ordering. By curbing the proliferation of nodes and edges, the design enforces strict budget control and reduces complexity from quadratic to near-linear in sequence length. The framework preserves salient information in multimodal temporal data and, on large clinical datasets (MIMIC-IV + CXR) and diverse cross-domain benchmarks (TimeMMD), achieves state-of-the-art predictive accuracy with markedly lower latency and memory usage. Beyond raw performance, EAMC–C2SG also yields interpretable segmentations and informative graph diagnostics, making it a scalable and transparent solution for multimodal temporal learning.
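The abstract's two core mechanisms, event-adaptive compression and budgeted, order-respecting sparse connectivity, can be sketched in a few lines. This is an illustrative reconstruction only, not the submission's actual algorithm: the change-point rule (`threshold`), the per-node predecessor budget `k`, and the function names are assumptions made for the example.

```python
def event_adaptive_segments(x, threshold=0.5):
    """Split a 1-D stream at points where the jump between consecutive
    samples exceeds `threshold` (an illustrative change-point rule; the
    submission's actual event criterion is not given on this page)."""
    bounds = [0]
    for i in range(1, len(x)):
        if abs(x[i] - x[i - 1]) > threshold:
            bounds.append(i)
    bounds.append(len(x))
    return list(zip(bounds[:-1], bounds[1:]))

def sparse_temporal_edges(num_nodes, k=3):
    """Connect each segment node only to its k most recent predecessors.
    Every edge (j, i) has j < i, so temporal order is respected, and the
    edge count is O(n * k), near-linear in n, rather than the O(n^2) of
    a fully connected temporal graph."""
    return [(j, i) for i in range(num_nodes)
                   for j in range(max(0, i - k), i)]

# A toy stream with two abrupt events yields three event-aligned segments.
stream = [0.0, 0.1, 0.1, 2.0, 2.1, 2.0, -1.0, -1.1]
segments = event_adaptive_segments(stream)   # [(0, 3), (3, 6), (6, 8)]
edges = sparse_temporal_edges(len(segments), k=2)
```

The sketch shows why the node and edge budgets stay bounded: segment count tracks the number of events rather than raw sequence length, and each node's in-degree is capped at `k`.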
Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 3247