Learning activation functions with PCA on a set of diverse piecewise-linear self-trained mappings

ICLR 2026 Conference Submission 19895 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Deep neural networks, Activation function learning, Principal Component Analysis (PCA)
TL;DR: This research proposes a novel data-driven approach to the design of activation functions for deep neural networks, using a subnetwork paradigm to learn effective functions that are then analyzed with PCA to uncover simple, high-performing analytical forms.
Abstract: This work explores a novel approach to learning activation functions, moving beyond the current reliance on human-engineered designs such as the ReLU. Activation functions are crucial to the performance of deep neural networks, yet selecting an optimal one remains challenging. While recent efforts have focused on automatically searching for these functions with parametric approaches, our research assumes no predefined functional form and instead lets the activation function be approximated by a subnetwork within a larger network, following the Network in Network (NIN) paradigm. We train several networks on a range of problems to generate a diverse set of effective activation functions, and subsequently apply Principal Component Analysis (PCA) to this collection to uncover its underlying structure. Our experiments show that only a few principal components suffice to explain most of the variance in the learned functions, and that these components generally have simple, identifiable analytical forms. Experiments using these analytical forms achieve state-of-the-art performance, highlighting the potential of this data-driven approach to activation function design.
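To make the pipeline concrete, below is a minimal sketch of the two ingredients the abstract describes: an elementwise activation learned as a small ReLU subnetwork (hence piecewise linear), and PCA applied to a collection of such learned functions sampled on a common grid. All names (e.g. `ActivationSubnet`), the hidden width, the grid range, and the use of untrained subnets as stand-ins for the trained collection are illustrative assumptions, not the authors' code.

```python
# Sketch only: untrained subnets stand in for subnets trained on diverse tasks.
import numpy as np
import torch
import torch.nn as nn

class ActivationSubnet(nn.Module):
    """One-input/one-output subnetwork applied elementwise (NIN-style)."""
    def __init__(self, hidden=16):
        super().__init__()
        self.f = nn.Sequential(
            nn.Linear(1, hidden), nn.ReLU(),  # ReLU keeps the learned map piecewise linear
            nn.Linear(hidden, 1),
        )

    def forward(self, x):
        # Flatten to (N, 1), apply the scalar function, restore the shape.
        return self.f(x.reshape(-1, 1)).reshape(x.shape)

# After training many such subnets on a range of problems, sample each
# learned activation on a shared input grid...
grid = torch.linspace(-3.0, 3.0, 200).unsqueeze(1)
subnets = [ActivationSubnet() for _ in range(50)]  # stand-ins for trained subnets
F = np.stack([s.f(grid).detach().numpy().ravel() for s in subnets])  # (n_funcs, n_grid)

# ...then run PCA (via SVD) on the centered collection of sampled functions.
F -= F.mean(axis=0, keepdims=True)
U, S, Vt = np.linalg.svd(F, full_matrices=False)
explained = (S ** 2) / (S ** 2).sum()
print("variance explained by first 3 components:", explained[:3].sum())
# Rows of Vt are the dominant activation "shapes", which can then be
# inspected and matched against simple analytical forms.
```

In this setup, the claim that "only a few principal components explain most of the variance" corresponds to the leading entries of `explained` summing close to 1.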
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 19895