Latent Union Completion

Published: 20 Oct 2025, Last Modified: 05 May 2026 | 2025 IEEE International Symposium on Information Theory (ISIT) | Everyone | CC BY 4.0
Abstract: Large amounts of missing data are increasingly common in modern high-dimensional datasets. High-rank matrix completion (HRMC) uses the powerful union of subspaces (UoS) model to handle these vast amounts of missing data. However, existing HRMC methods often fail on real data that does not follow the UoS model exactly. Here we propose a new approach: instead of finding a UoS that fits the observed data directly, we find a UoS in a latent space that fits a non-linear embedding of the original data. Embeddings of this kind are typically obtained with deep architectures. However, the abundance of missing data impedes training, because the observed coordinates of different samples rarely overlap. We overcome this difficulty with a novel pseudo-completion layer (which estimates the missing values) followed by an autoencoder (which finds the embedding) coupled with a self-expressive layer (which clusters the data according to a UoS in the latent space). Our design reduces the exponential memory requirements that are typically induced by uneven patterns of missing data. We give exact details of our architecture, model, loss functions, and training strategy. Our experiments on several real datasets show that our method consistently improves on state-of-the-art accuracy by more than 40%.
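The three components named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`pseudo_complete`, `masked_reconstruction_loss`, `self_expressive_loss`), the squared-error losses, and the toy data are all assumptions made for illustration, and no autoencoder or training loop is included. The sketch only shows what each quantity measures: filling missing entries with estimates, scoring a reconstruction on observed entries only, and checking that each latent point is a combination of the other points.

```python
import numpy as np

def pseudo_complete(x, mask, fill):
    """Keep observed entries of x; use the estimates in `fill` elsewhere (hypothetical helper)."""
    return np.where(mask, x, fill)

def masked_reconstruction_loss(x, x_hat, mask):
    """Squared error counted only on observed entries (assumed loss form)."""
    diff = np.where(mask, x - x_hat, 0.0)
    return float(np.sum(diff ** 2))

def self_expressive_loss(z, c):
    """Squared residual of Z - C Z, with the diagonal of C forced to zero.

    Rows of z are latent samples; a zero diagonal stops a sample from
    trivially representing itself.
    """
    c = c - np.diag(np.diag(c))
    return float(np.sum((z - c @ z) ** 2))

# Toy data: NaN marks a missing entry.
x = np.array([[1.0, np.nan],
              [np.nan, 2.0]])
mask = ~np.isnan(x)
completed = pseudo_complete(x, mask, fill=np.zeros_like(x))

# A reconstruction that is wrong only on one observed entry (1.5 vs 1.0).
x_hat = np.array([[1.5, 9.0],
                  [9.0, 2.0]])
rec_loss = masked_reconstruction_loss(x, x_hat, mask)

# Latent points on two lines (a union of two 1-D subspaces).
z = np.array([[1.0, 0.0],
              [2.0, 0.0],
              [0.0, 1.0],
              [0.0, 3.0]])
# Block-diagonal coefficients: each point is built from the other point on its own line.
c = np.array([[0.0, 0.5, 0.0, 0.0],
              [2.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 1.0 / 3.0],
              [0.0, 0.0, 3.0, 0.0]])
se_loss = self_expressive_loss(z, c)
```

In this toy case the coefficient matrix is block diagonal, so its nonzero pattern reveals which latent points share a subspace; that is the property a self-expressive layer exploits for clustering.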