Practical Shuffle Coding

Published: 25 Sept 2024 · Last Modified: 06 Nov 2024 · NeurIPS 2024 poster · CC BY 4.0
Keywords: graph compression, entropy coding, bits-back coding, lossless compression, generative models, information theory, probabilistic models, graph neural networks, multiset compression, asymmetric numeral systems, compression, entropy, shuffle coding
TL;DR: We present a general method for practical lossless compression of unordered data structures that achieves state-of-the-art rates and speeds on large graphs.
Abstract: We present a general method for lossless compression of unordered data structures, including multisets and graphs. It is a variant of shuffle coding that is many orders of magnitude faster than the original and enables 'one-shot' compression of single unordered objects. Our method achieves state-of-the-art compression rates on various large-scale network graphs at speeds of megabytes per second, efficiently handling even a multi-gigabyte plain graph with one billion edges. We release an implementation that can be easily adapted to different data types and statistical models.
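The rate gain of shuffle coding comes from discarding ordering information that an unordered object does not carry: for a multiset of n items with symbol multiplicities m_i this is log2(n!/∏_i m_i!) bits, and for a plain graph on n vertices it is log2(n!) minus log2 of the size of its automorphism group. The sketch below (ours, not the authors' released implementation; the function name and example are illustrative) computes the multiset bound that bits-back shuffle coding aims to recover in practice.

```python
from collections import Counter
from math import factorial, log2

def ordering_information_bits(items):
    """Bits needed to specify an ordering of the multiset `items`:
    log2(n! / prod_i m_i!), where m_i are the symbol multiplicities.
    This is an upper bound on the rate saving of shuffle coding over
    coding the items as an ordered sequence."""
    counts = Counter(items)
    n = sum(counts.values())
    orderings = factorial(n)
    for m in counts.values():
        orderings //= factorial(m)  # exact: the multinomial coefficient is an integer
    return log2(orderings)

# Example: 1,000 items drawn from 10 symbols.
example = [i % 10 for i in range(1000)]
print(f"{ordering_information_bits(example):.1f} bits of ordering information")
```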
Primary Area: Probabilistic methods (for example: variational inference, Gaussian processes)
Submission Number: 15389