Scalable Robust Bayesian Co-Clustering with Compositional ELBOs
Keywords: Variational Co-Clustering, Latent Space Modeling, Doubly Reparameterized ELBO, Noise-Robust Representation Learning, Mutual Information Regularization
TL;DR: We propose a VAE-based co-clustering framework with GMM priors, scale-adjusted ELBO, and MI cross-loss to address noise and posterior collapse. It jointly clusters rows and columns and outperforms prior methods on noisy, high-dimensional datasets.
Abstract: Co-clustering exploits the duality of instances and features to simultaneously uncover meaningful groups in both dimensions, often outperforming traditional clustering in high-dimensional or sparse data settings. Although recent deep learning approaches successfully integrate feature learning and cluster assignment, they remain susceptible to noise and can suffer from posterior collapse in standard autoencoders. In this paper, we propose a variational co-clustering framework that combines variational autoencoders (VAEs) with Gaussian mixture priors to jointly learn row and column cluster assignments in latent space, incorporating doubly reparameterized gradients and scale modifications to address posterior collapse and improve training stability. Our unsupervised model integrates a Variational Deep Embedding with a Gaussian Mixture Model (GMM) prior for both instances and features, providing a built-in clustering mechanism that naturally aligns latent modes with row and column clusters. Furthermore, our regularized, end-to-end noise-learning Compositional ELBO architecture jointly reconstructs the data while regularizing against noise through the KL divergence, gracefully handling corrupted or missing inputs in a single training pipeline. To counteract posterior collapse, we introduce a scale modification that increases the encoder's latent means only in the reconstruction pathway, preserving richer latent representations without inflating the KL term. Finally, a mutual information-based cross-loss ensures coherent co-clustering of rows and columns. Empirical results on diverse real-world datasets spanning numerical, textual, and image modalities demonstrate that our method not only preserves the advantages of prior co-clustering approaches but also exceeds them in accuracy and robustness, particularly in high-dimensional or noisy settings.
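The scale modification described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; the latent dimension, the scale factor `alpha`, and the encoder outputs `mu`/`logvar` are all hypothetical. The point it demonstrates: scaling the latent mean only in the reconstruction pathway enlarges the signal the decoder receives, while the KL term is still evaluated at the unscaled mean, so the regularizer is not inflated.

```python
import numpy as np

rng = np.random.default_rng(0)

def kl_to_standard_normal(mu, logvar):
    # KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dims
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)

# Hypothetical encoder outputs for a single instance (latent dim = 8)
mu = rng.normal(size=8)
logvar = rng.normal(scale=0.1, size=8)

alpha = 2.0                 # assumed scale factor (> 1), reconstruction path only
eps = rng.normal(size=8)    # reparameterization noise

# Decoder input uses the scaled mean; the KL penalty uses the unscaled mean.
z_recon = alpha * mu + np.exp(0.5 * logvar) * eps
kl = kl_to_standard_normal(mu, logvar)

# For comparison: scaling the mean inside the KL would strictly increase it
# (by 0.5 * (alpha^2 - 1) * sum(mu^2)), which the scheme above avoids.
kl_if_scaled = kl_to_standard_normal(alpha * mu, logvar)
```

Here `kl < kl_if_scaled` whenever `mu` is nonzero, which is the sense in which the reconstruction pathway is enriched "without inflating the KL term."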
Primary Area: probabilistic methods (Bayesian methods, variational inference, sampling, UQ, etc.)
Code Of Ethics: true
Submission Guidelines: true
Anonymous Url: true
No Acknowledgement Section: true
Submission Number: 22631