Keywords: Latent Causal Models, Multimodal Learning, Identifiability, Contrastive Learning
Abstract: Causal discovery often assumes directed acyclic graphs (DAGs); however, accurately identifying these DAGs requires various assumptions, particularly in latent causal models, that can be challenging to validate in real-world applications. This raises a critical question: Are DAG assumptions truly necessary for certain applications? In this work, we introduce a novel latent partial causal model for multimodal data, which features two latent coupled variables, connected by an undirected edge, that represent transferable knowledge across different modalities. We focus on a prominent learning framework, namely multimodal contrastive learning, and demonstrate that, under certain statistical assumptions, multimodal contrastive learning identifies the latent coupled variables up to a trivial transformation. This finding enhances our understanding of the mechanisms driving the success of multimodal contrastive learning. Furthermore, it reveals a unique potential for disentanglement in multimodal contrastive representation learning, improving the utility of pre-trained models such as CLIP that are trained using this approach. Through experiments with synthetic data, we demonstrate the robustness of our findings, even when the assumptions are violated. In addition, we validate the disentanglement capabilities of pre-trained CLIP, which facilitate few-shot learning and improve domain generalization across a diverse range of real-world datasets.
Primary Area: causal reasoning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 5639