HyperRep: Hypergraph-Based Self-Supervised Multimodal Representation Learning

22 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: representation learning for computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: hypergraph learning; multimodal learning; self-supervised learning; representation learning; clustering
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Self-supervised representation learning on multimodal data plays a pivotal role in integrating and embedding information from diverse sources without additional labeling. Most existing methods, however, overlook the complex high-order inter- and intra-modality correlations characteristic of real-world multimodal data. In this paper, we introduce HyperRep, which combines the strengths of hypergraph-based modeling with a self-supervised multimodal fusion information bottleneck principle. The former captures high-order correlations by using hypergraphs to represent inter- and intra-modality relations, while the latter constrains the solution space to ensure a more effective fusion of multimodal data. Extensive experiments on four public datasets across three downstream tasks demonstrate HyperRep's effectiveness, as it consistently delivers results competitive with or superior to state-of-the-art methods.
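As a rough illustration of the hypergraph side of this idea (not the authors' implementation, whose details are not given here), the sketch below builds intra-modality kNN hyperedges for two toy modalities of the same samples and applies one standard HGNN-style propagation step over the combined incidence matrix. The function names, the kNN hyperedge construction, and the toy feature dimensions are all illustrative assumptions.

```python
import numpy as np

def knn_hyperedges(feats, k=3):
    """Incidence matrix with one hyperedge per node, grouping the node with
    its k nearest neighbours inside a single modality (illustrative choice)."""
    n = feats.shape[0]
    dists = np.linalg.norm(feats[:, None, :] - feats[None, :, :], axis=-1)
    H = np.zeros((n, n))
    for i in range(n):
        neigh = np.argsort(dists[i])[: k + 1]  # the node itself plus k neighbours
        H[neigh, i] = 1.0
    return H

def hypergraph_propagate(X, H):
    """One HGNN-style smoothing step with unit edge weights:
    X' = Dv^{-1/2} H De^{-1} H^T Dv^{-1/2} X."""
    Dv = np.diag(1.0 / np.sqrt(H.sum(axis=1) + 1e-8))  # node degrees
    De = np.diag(1.0 / (H.sum(axis=0) + 1e-8))         # hyperedge degrees
    return Dv @ H @ De @ H.T @ Dv @ X

# Toy multimodal example: image and text features for the same 8 samples.
rng = np.random.default_rng(0)
img_feats, txt_feats = rng.normal(size=(8, 16)), rng.normal(size=(8, 32))

# Hyperedges from each modality are concatenated, so a single hypergraph
# carries intra-modality correlations from both sources over shared samples.
H = np.concatenate([knn_hyperedges(img_feats), knn_hyperedges(txt_feats)], axis=1)
fused = hypergraph_propagate(np.concatenate([img_feats, txt_feats], axis=1), H)
print(fused.shape)  # (8, 48)
```

Because each hyperedge can connect more than two nodes, this propagation mixes information along groups of related samples rather than pairwise edges, which is the sense in which hypergraphs capture high-order correlations.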
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 4269