Compact Decomposition of Irregular Tensors for Data Compression: From Sparse to Dense to High-Order Tensors
Abstract: An irregular tensor is a collection of matrices with different numbers of rows. Real-world data from diverse domains, including medical and stock data, are naturally represented as irregular tensors because of inherent variations in data length. Various tensor decomposition methods (e.g., PARAFAC2) have been devised to analyze them. Although these methods are expected to compress large-scale irregular tensors effectively, much as regular tensor decomposition methods do, our analysis reveals that their compression performance is limited by the large number of first-mode factor matrices. In this work, we propose accurate and compact decomposition methods for lossy compression of irregular tensors. First, we propose Light-IT, which unifies all first-mode factor matrices into a single matrix, dramatically reducing the size of compressed outputs. Second, motivated by the success of Tucker decomposition in regular tensor compression, we extend Light-IT to Light-IT++ to enhance its expressive power and thus reduce compression error. Finally, we generalize both methods to handle irregular tensors of any order and leverage the sparsity of tensors for acceleration. Extensive experiments on 6 real-world datasets demonstrate that our methods are (a) Compact: their compressed output is up to 37× smaller than that of the most concise baseline, (b) Accurate: our methods are up to 5× more accurate, with smaller compressed output, than the most accurate baseline, and (c) Versatile: our methods are effective for sparse, dense, and higher-order tensors.
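To make the size argument concrete, the following sketch counts stored values under the common PARAFAC2 form X_k ≈ U_k S_k Vᵀ, where each slice k keeps its own first-mode factor U_k. It then compares that count with a hypothetical layout in which one first-mode matrix is shared by all slices. The function names, the example sizes, and the `shared_rows` parameter are illustrative assumptions; the abstract does not specify how Light-IT sizes or indexes its single matrix, so this is not the paper's actual parameterization.

```python
def parafac2_param_count(row_counts, num_cols, rank):
    """Stored values for X_k ~ U_k S_k V^T with one U_k per slice."""
    first_mode = sum(i_k * rank for i_k in row_counts)  # U_k is I_k x R
    slice_weights = len(row_counts) * rank              # diagonal of each S_k
    shared_v = num_cols * rank                          # V is J x R
    return first_mode + slice_weights + shared_v


def shared_first_mode_param_count(row_counts, num_cols, rank, shared_rows):
    """Hypothetical layout: a single first-mode matrix shared by all slices.

    `shared_rows` is an assumed size for that matrix, chosen for illustration.
    """
    first_mode = shared_rows * rank                     # one matrix, not K
    slice_weights = len(row_counts) * rank
    shared_v = num_cols * rank
    return first_mode + slice_weights + shared_v


# Three slices with 100, 250, and 400 rows, 50 columns each, rank 5.
rows = [100, 250, 400]
per_slice = parafac2_param_count(rows, num_cols=50, rank=5)
shared = shared_first_mode_param_count(rows, num_cols=50, rank=5,
                                       shared_rows=max(rows))
print(per_slice, shared)  # 4015 2265
```

In the per-slice layout the first-mode term grows with the total number of rows across all slices, so it dominates as more slices are added, whereas the shared term stays fixed in this simplified count.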