Abstract: Spatial transcriptomics provides powerful insights into cellular interactions and the mechanisms of disease development by combining high-throughput gene sequencing with spatially resolved imaging to analyze spatially variable genes in their native tissue context. However, existing methods typically map aggregated multi-view features into a unified representation, ignoring the heterogeneity and view independence of gene expression and spatial information. To this end, we propose heterogeneous Graph-guided Contrastive Learning (stGCL) for aggregating spatial transcriptomics data. Guided by the inherent heterogeneity of cellular molecules, the method dynamically coordinates triple-level node attributes through a contrastive learning loss distributed across view domains, thereby maintaining view independence during aggregation. In addition, we introduce a cross-view hierarchical feature alignment module that decouples the spatial and gene views of molecular structure in parallel while aggregating multi-view features according to information theory, thereby enhancing inter- and intra-view integrity. Rigorous experiments demonstrate that stGCL outperforms existing methods across various tasks and related downstream applications.
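Since the abstract gives no implementation details, the following is a minimal PyTorch sketch of the kind of per-view contrastive objective it describes, assuming an InfoNCE-style loss that pairs each spot's spatial-graph embedding with its gene-expression embedding; the function name cross_view_infonce and all arguments (z_spatial, z_gene, tau) are illustrative assumptions, not stGCL's actual interface.

# Hedged sketch: an InfoNCE-style contrastive loss between two views of the
# same spots. Positives are the two embeddings of one spot; all other spots
# in the batch serve as negatives.
import torch
import torch.nn.functional as F

def cross_view_infonce(z_spatial, z_gene, tau=0.1):
    z_s = F.normalize(z_spatial, dim=1)   # (n_spots, d) spatial-view embeddings
    z_g = F.normalize(z_gene, dim=1)      # (n_spots, d) gene-view embeddings
    logits = z_s @ z_g.t() / tau          # pairwise cosine similarity / temperature
    labels = torch.arange(z_s.size(0), device=z_s.device)
    # Symmetric loss over both directions (spatial->gene and gene->spatial).
    return 0.5 * (F.cross_entropy(logits, labels) +
                  F.cross_entropy(logits.t(), labels))

Because the loss is computed per view pair rather than on a fused representation, each view's embedding space is trained without being collapsed into the other, which is one way to read the abstract's claim of "maintaining view independence during aggregation."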
Primary Subject Area: [Content] Multimodal Fusion
Secondary Subject Area: [Content] Multimodal Fusion
Relevance To Conference: By introducing heterogeneous Graph-guided Contrastive Learning (stGCL), this work advances multi-view and multimodal processing, particularly in spatial transcriptomics. Its main contribution is an approach to managing the inherent heterogeneity and view independence of spatial and genetic data. Traditional methods in spatial transcriptomics typically amalgamate features from different sources into a unified representation, overlooking the distinct, view-specific nature of the data; this can discard critical information and lead to inaccurate interpretation.
stGCL addresses this by embedding genetic and spatial information into a joint latent distribution while respecting the unique distribution of each view. This allows node attributes to be dynamically refined across views, enabling a more nuanced integration of multimodal data. Moreover, the cross-view hierarchical feature alignment module introduced in this framework ensures that view-specific structure is preserved during feature aggregation, yielding a more faithful representation of the original data sources and enhancing the ability to handle and interpret complex multimodal datasets.
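As a hedged illustration of the parallel decoupling-then-aggregation described above (not the authors' implementation), the sketch below keeps two view-specific encoders separate and fuses their outputs with a learned gate; the module name TwoViewFusion, the encoder architectures, and the sigmoid gating choice are all assumptions.

# Hedged sketch: view-specific branches stay decoupled until a learned gate
# aggregates them into a joint embedding.
import torch
import torch.nn as nn

class TwoViewFusion(nn.Module):
    def __init__(self, in_spatial, in_gene, d=64):
        super().__init__()
        self.enc_spatial = nn.Sequential(nn.Linear(in_spatial, d), nn.ReLU())
        self.enc_gene = nn.Sequential(nn.Linear(in_gene, d), nn.ReLU())
        self.gate = nn.Linear(2 * d, d)   # produces per-dimension mixing weights

    def forward(self, x_spatial, x_gene):
        h_s = self.enc_spatial(x_spatial)  # spatial branch, independent parameters
        h_g = self.enc_gene(x_gene)        # gene branch, independent parameters
        w = torch.sigmoid(self.gate(torch.cat([h_s, h_g], dim=1)))
        return w * h_s + (1 - w) * h_g     # gated aggregation of the two views

Because the gate mixes the views per dimension rather than concatenating or averaging them wholesale, neither view's structure is overwritten during fusion, which loosely mirrors the submission's stated goal of preserving view-specific structure while aggregating.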
Supplementary Material: zip
Submission Number: 22