Alleviating Observation Bias via Causal-Invariant Meta-Learning for Unbalanced Incomplete Multi-view Clustering

Jiaqi Jin; Siwei Wang; Taichun Zhou; Zhibin Dong; Siqi Wang; Miaomiao Li; Xinwang Liu; En Zhu

Published: 30 Apr 2026, Last Modified: 24 Jun 2026, Venue: ICML 2026 regular, License: CC BY 4.0

Abstract: In real-world scenarios, multi-view data often exhibit markedly unbalanced missing patterns, with observation rates varying substantially across views. Such observation bias makes it difficult for cross-view associations learned from the limited set of complete samples to generalize to incomplete samples, which in turn makes cross-view recovery hard. Meanwhile, observation bias acts as a confounder, causing clustering predictions to depend spuriously on low-missing-rate views. To address these challenges, we propose CIMLN, a novel **C**ausal-**I**nvariant **M**eta-**L**earning **N**etwork that alleviates observation bias for unbalanced incomplete multi-view clustering. The context-aware meta-generation module formulates view recovery as a meta-learning task, enabling rapid adaptation to incomplete samples by encoding global statistical relationships through context information. The causal-invariant structure learning module constructs counterfactual scenarios by artificially masking low-missing-rate views and enforces clustering consistency across the resulting observation patterns. Extensive experiments on eight benchmarks demonstrate the effectiveness of CIMLN. The code is available at https://github.com/jinjiaqi1998/CIMLN.
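To make the counterfactual-masking idea concrete, the following is a minimal NumPy sketch and not the authors' implementation (see the linked repository for that). It assumes a binary observation mask of shape `(n_samples, n_views)`, view embeddings that already share one dimension, mean fusion of observed views, and a symmetric KL divergence as the consistency measure. All function names, the fusion rule, and the `drop_prob` parameter are hypothetical choices made for illustration.

```python
import numpy as np


def observation_rates(mask):
    """Fraction of samples in which each view is observed (mask: 1 = observed)."""
    return mask.mean(axis=0)


def counterfactual_mask(mask, rng, drop_prob=0.5):
    """Hide the most-observed (lowest-missing-rate) view for a random subset of samples.

    A sample is only altered if the view is observed there and at least one
    other view remains, so no sample ends up with zero views.
    """
    dominant = int(np.argmax(observation_rates(mask)))
    cf = mask.copy()
    has_other = (mask.sum(axis=1) - mask[:, dominant]) > 0
    drop = (rng.random(mask.shape[0]) < drop_prob) & has_other & (mask[:, dominant] == 1)
    cf[drop, dominant] = 0
    return cf, dominant


def soft_assignments(views, mask, centers):
    """Mean-fuse observed view embeddings, then softmax over negative squared distances.

    views: (n, v, d) embeddings, mask: (n, v), centers: (k, d).
    Mean fusion is an assumption of this sketch, not taken from the paper.
    """
    w = mask[:, :, None].astype(float)
    fused = (views * w).sum(axis=1) / np.maximum(w.sum(axis=1), 1.0)
    d2 = ((fused[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    logits = -d2
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)


def consistency_loss(p, q, eps=1e-12):
    """Mean symmetric KL divergence between two sets of soft cluster assignments."""
    log_p, log_q = np.log(p + eps), np.log(q + eps)
    kl_pq = (p * (log_p - log_q)).sum(axis=1)
    kl_qp = (q * (log_q - log_p)).sum(axis=1)
    return float(0.5 * (kl_pq + kl_qp).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # View 0 is always observed; views 1 and 2 are often missing.
    mask = np.array([[1, 1, 0], [1, 0, 1], [1, 0, 0], [1, 1, 1], [1, 0, 0], [1, 1, 0]])
    views = rng.normal(size=(6, 3, 2))
    centers = rng.normal(size=(2, 2))

    cf, dominant = counterfactual_mask(mask, rng, drop_prob=1.0)
    p = soft_assignments(views, mask, centers)   # factual observation pattern
    q = soft_assignments(views, cf, centers)     # counterfactual pattern
    print("dominant view:", dominant)
    print("consistency loss:", consistency_loss(p, q))
```

In a trainable model, a loss of this kind would be minimized with respect to the encoder parameters, so that cluster assignments change little when the dominant view is hidden. Here it is only evaluated on fixed random embeddings.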

Lay Summary: In real life, data collected from multiple sources are often missing unevenly, with some sources having far fewer missing entries than others. This uneven missingness produces biased observation patterns. It becomes hard to use fully available data to fill in the incomplete data from other sources, and clustering results end up relying too heavily on the sources with almost no missing values. To solve these problems, we design a new method called CIMLN that reduces this observation bias and improves clustering performance for multi-view data with unbalanced missing information. Our method adapts quickly to data with missing entries by learning the overall relationships among sources, and it simulates different missing situations to keep clustering results stable and consistent regardless of which data are missing. We test our method on eight common datasets, and the results show that it works well. The code is openly available at https://github.com/jinjiaqi1998/CIMLN.

Primary Area: General Machine Learning->Clustering

Keywords: Incomplete Multi-view Clustering; Unbalanced Missingness; Meta-Learning; Causal Inference

Submission Number: 3212