Replacing Multi-Layer Perceptron with Taylor Expansion

16 Sept 2025 (modified: 18 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: MLP, efficient inference
TL;DR: We replace the MLP with a lightweight module that combines K-means clustering and Taylor expansion, enabling highly efficient inference.
Abstract: The Multi-Layer Perceptron (MLP) is a fundamental building block of deep learning. However, its reliance on dense matrix multiplication imposes a substantial computational burden, especially on resource-constrained edge devices, posing a critical challenge to the real-time inference requirements of edge artificial intelligence applications. In this paper, we propose T-MLP, a novel method that mitigates this challenge by approximating the MLP with a lightweight module that combines K-means clustering and first-order Taylor expansion. With T-MLP in place, online inference reduces to a single K-means assignment, followed by one dense matrix multiplication and two element-wise additions. On a commercial edge device, T-MLP yields a $>100\times$ speed-up on large-scale MLPs with $<1\%$ accuracy loss. Grounded in its reduced time complexity and hardware-friendly footprint, T-MLP establishes a new paradigm for efficient edge-side inference.
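To make the inference procedure in the abstract concrete, here is a minimal NumPy sketch of the general idea: offline, fit centroids to the MLP's inputs and precompute the MLP output and Jacobian at each centroid; online, assign an input to its nearest centroid and evaluate the first-order Taylor expansion there, which costs one nearest-centroid assignment, one matrix multiplication, and two element-wise additions. All names, shapes, and the toy one-hidden-layer MLP are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, K = 8, 4, 16

# Toy one-hidden-layer MLP f(x) = tanh(x W1 + b1) W2 + b2 (assumed for illustration).
W1 = rng.standard_normal((d_in, 32)); b1 = rng.standard_normal(32)
W2 = rng.standard_normal((32, d_out)); b2 = rng.standard_normal(d_out)

def mlp(x):
    return np.tanh(x @ W1 + b1) @ W2 + b2

def jacobian(x):
    # Jacobian of the toy MLP at x, shape (d_in, d_out):
    # d tanh(z)/dz = 1 - tanh(z)^2, applied column-wise to W1.
    h = np.tanh(x @ W1 + b1)
    return (W1 * (1.0 - h**2)) @ W2

# Offline stage: centroids (stand-ins for K-means cluster centers fitted on
# training inputs), plus precomputed f(c_k) and Jacobians J(c_k).
C = rng.standard_normal((K, d_in))           # (K, d_in) centroids
F = np.stack([mlp(c) for c in C])            # (K, d_out) cached outputs
J = np.stack([jacobian(c) for c in C])       # (K, d_in, d_out) cached Jacobians

def t_mlp(x):
    # Online stage: one K-means assignment, one dense matmul, two additions:
    # f(x) ≈ f(c_k) + J(c_k)^T (x - c_k) for the nearest centroid c_k.
    k = np.argmin(np.sum((C - x) ** 2, axis=1))
    return F[k] + (x - C[k]) @ J[k]

# Near a centroid the linearization tracks the true MLP closely.
x = C[3] + 0.01 * rng.standard_normal(d_in)
print(np.max(np.abs(t_mlp(x) - mlp(x))))
```

The linearization is exact at each centroid and degrades quadratically with distance from it, which is why the accuracy of such a scheme hinges on how finely K-means tiles the input distribution.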
Supplementary Material: zip
Primary Area: optimization
Submission Number: 7137