QuadEnhancer: Leveraging Quadratic Transformations to Enhance Deep Neural Networks

Published: 18 Sept 2025, Last Modified: 29 Oct 2025
Venue: NeurIPS 2025 poster
License: CC BY 4.0
Keywords: Deep Learning, LLM
Abstract: The combination of linear transformations and nonlinear activation functions forms the foundation of most modern deep neural networks, enabling them to approximate highly complex functions. This paper explores the introduction of quadratic transformations to further increase the nonlinearity of the model, with the aim of enhancing the performance of existing architectures. To minimize the additional parameters and computational burden, we propose a lightweight quadratic enhancer that leverages matrix decomposition, weight sharing, and sparsification techniques. This approach introduces only a negligible increase in parameters and forward computation, while still yielding substantial improvements in model performance. We evaluate the effectiveness of the proposed method across three tasks: text classification, image classification, and fine-tuning large language models (LLMs). In all tasks, our approach demonstrates significant performance gains.
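The abstract does not give the exact formulation, but the core idea of a parameter-efficient quadratic term can be illustrated with a minimal sketch. Below is one plausible reading, assuming a low-rank decomposition of the quadratic interaction: the module name `QuadEnhancedLinear`, the factors `U`, `V`, `D`, and the `rank` hyperparameter are all hypothetical names for illustration, not the paper's actual method.

```python
import torch
import torch.nn as nn

class QuadEnhancedLinear(nn.Module):
    """Hypothetical sketch: a linear layer with a low-rank quadratic correction.

    y = W x + b + D ((U x) * (V x))

    where * is elementwise. With U, V of shape (rank, d_in) and D of shape
    (d_out, rank), the quadratic term adds only rank * (2 * d_in + d_out)
    parameters, keeping the overhead small for small rank.
    """

    def __init__(self, d_in: int, d_out: int, rank: int = 4):
        super().__init__()
        self.linear = nn.Linear(d_in, d_out)
        self.U = nn.Parameter(torch.randn(rank, d_in) / d_in**0.5)
        self.V = nn.Parameter(torch.randn(rank, d_in) / d_in**0.5)
        # Zero-init so the module starts out identical to the plain linear layer.
        self.D = nn.Parameter(torch.zeros(d_out, rank))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        quad = (x @ self.U.T) * (x @ self.V.T)  # (..., rank) quadratic features
        return self.linear(x) + quad @ self.D.T


# Usage: a drop-in replacement for nn.Linear.
layer = QuadEnhancedLinear(768, 768, rank=4)
y = layer(torch.randn(2, 768))
```

Zero-initializing `D` makes training start from the unmodified linear layer, one simple way to realize the abstract's claim of a negligible perturbation to the base architecture; the weight-sharing and sparsification techniques mentioned in the abstract are not reflected in this sketch.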
Primary Area: Deep learning (e.g., architectures, generative models, optimization for deep networks, foundation models, LLMs)
Submission Number: 26694