Sinusoidal Initialization, Time for a New Start

Published: 18 Sept 2025, Last Modified: 29 Oct 2025 · NeurIPS 2025 poster · CC BY 4.0
Keywords: Initialization, Sinusoidal, Deep neural network, Activation
TL;DR: Sinusoidal initialization replaces random weight seeding with a deterministic, structured scheme that balances weights and neuron activations from the outset, yielding faster, more stable training and higher accuracy across diverse models.
Abstract: Initialization plays a critical role in Deep Neural Network training, directly influencing convergence, stability, and generalization. Common approaches such as Glorot and He initialization rely on randomness, which can produce uneven weight distributions across layer connections. In this paper, we introduce Sinusoidal initialization, a novel deterministic method that employs sinusoidal functions to construct structured weight matrices, designed expressly to improve the spread and balance of weights throughout the network while fostering a more uniform, well-conditioned distribution of neuron activation states from the very first forward pass. Because Sinusoidal initialization begins with weights and activations that are already evenly and efficiently utilized, it delivers consistently faster convergence, greater training stability, and higher final accuracy across a wide range of models, including convolutional neural networks, vision transformers, and large language models. On average, our experiments show an increase of 4.8% in final validation accuracy and 20.9% in convergence speed. By replacing randomness with structure, this initialization provides a stronger and more reliable foundation for Deep Learning systems.
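The abstract does not state the exact weight formula, so the sketch below is only a minimal illustration of the general idea of a deterministic, sinusoidal weight initializer: each output row is filled with a sine wave of a distinct frequency across the input dimension, scaled to roughly match Glorot-style variance. The function name sinusoidal_init_ and the specific frequency, phase, and scaling choices are assumptions made for this example, not the paper's actual scheme.

import math
import torch

def sinusoidal_init_(weight: torch.Tensor) -> torch.Tensor:
    """Deterministically fill a 2D weight tensor (out_features x in_features)
    with a structured sinusoidal pattern. Illustrative formula only; the
    paper's exact construction may differ."""
    out_features, in_features = weight.shape
    # Assumed scaling, chosen to keep the variance comparable to Glorot init.
    scale = math.sqrt(2.0 / (in_features + out_features))
    with torch.no_grad():
        positions = torch.arange(in_features, dtype=weight.dtype)
        for i in range(out_features):
            # Assumed scheme: each output row gets its own frequency, so rows
            # are distinct and weights are spread evenly rather than sampled
            # at random.
            freq = (i + 1) * math.pi / in_features
            weight[i] = scale * torch.sin(freq * (positions + 1))
    return weight

# Usage: apply to a layer's weights before training, in place of random init.
layer = torch.nn.Linear(256, 128)
sinusoidal_init_(layer.weight)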
Supplementary Material: zip
Primary Area: Deep learning (e.g., architectures, generative models, optimization for deep networks, foundation models, LLMs)
Submission Number: 27102