DataMUX: Data Multiplexing for Neural Networks

Vishvak Murahari; Carlos E Jimenez; Runzhe Yang; Karthik R Narasimhan

Published: 31 Oct 2022, Last Modified: 06 Apr 2025. Venue: NeurIPS 2022 Accept. Readers: Everyone

Keywords: Neural networks, Multiplexing, Efficient inference

Abstract: In this paper, we introduce data multiplexing (DataMUX), a technique that enables deep neural networks to process multiple inputs simultaneously using a single compact representation. DataMUX demonstrates that neural networks are capable of generating accurate predictions over mixtures of inputs, resulting in increased inference throughput with minimal extra memory requirements. Our approach uses two key components: 1) a multiplexing layer that applies a fixed linear transformation to each input before combining them into a "mixed" representation of the same size as a single input, which is then processed by the base network, and 2) a demultiplexing layer that converts the base network's output back into independent representations before producing predictions for each input. We show the viability of DataMUX for different architectures (Transformers, and to a much lesser extent MLPs and CNNs) across six different tasks spanning sentence classification, named entity recognition and image classification. For instance, DataMUX for Transformers can multiplex up to 20x/40x inputs, achieving up to an 11x/18x increase in inference throughput with absolute performance drops of <2% and <4%, respectively, compared to a vanilla Transformer on MNLI, a natural language inference task. We also provide a theoretical construction for multiplexing in self-attention networks and analyze the effect of various design elements in DataMUX.
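The two components described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes random orthogonal matrices as the fixed per-input transformations, averaging as the combination step, an identity function in place of the base network, and a transpose-based demultiplexer as a toy stand-in for the demultiplexing layer. All function names (`make_mux_transforms`, `multiplex`, `demultiplex_linear`) are hypothetical.

```python
import numpy as np


def make_mux_transforms(num_inputs, dim, seed=0):
    """Draw one fixed (untrained) random orthogonal matrix per input slot.

    Assumption: orthogonal matrices are just one possible choice of
    fixed linear transformation.
    """
    rng = np.random.default_rng(seed)
    transforms = []
    for _ in range(num_inputs):
        q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
        transforms.append(q)
    return np.stack(transforms)  # shape: (num_inputs, dim, dim)


def multiplex(inputs, transforms):
    """Transform each input with its own matrix, then average.

    inputs: (num_inputs, dim) -> mixed representation of shape (dim,),
    i.e. the same size as a single input.
    """
    transformed = np.einsum("nij,nj->ni", transforms, inputs)
    return transformed.mean(axis=0)


def demultiplex_linear(mixed, transforms):
    """Toy demultiplexer: apply each slot's transposed matrix to the mix.

    Returns (num_inputs, dim). With more than one input, each row also
    contains interference from the other inputs.
    """
    return np.einsum("nji,j->ni", transforms, mixed)


# Usage: mix 4 inputs of dimension 8 into one vector, then split again.
num_inputs, dim = 4, 8
transforms = make_mux_transforms(num_inputs, dim)
inputs = np.random.default_rng(1).standard_normal((num_inputs, dim))
mixed = multiplex(inputs, transforms)         # shape (8,)
hidden = mixed                                # placeholder for the base network
per_input = demultiplex_linear(hidden, transforms)  # shape (4, 8)
```

The point of the sketch is the shapes: the base network sees one vector of size `dim` regardless of how many inputs were mixed, which is where the throughput gain would come from. With a single input the transpose undoes the transformation exactly; with several inputs the outputs are only noisy estimates, so this toy demultiplexer should not be read as a substitute for the demultiplexing layer the paper describes.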

TL;DR: We present data multiplexing (DataMUX), a technique that enables deep neural networks to process multiple inputs simultaneously using a single compact representation and dramatically improves inference throughput.

Supplementary Material: zip

Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/datamux-data-multiplexing-for-neural-networks/code)
