Sparse Spectral Training and Inference on Euclidean and Hyperbolic Neural Networks

Published: 05 Mar 2025, Last Modified: 10 Apr 2025 | Venue: SLLM | License: CC BY 4.0
Track: long paper (up to 4 pages)
Keywords: Parameter-efficient training; Pre-training; Hyperbolic Network
Abstract: The increasing GPU memory demands of large language models call for more memory-efficient training methods. Existing approaches like LoRA struggle with low-rank constraints in pre-training, while ReLoRA suffers from saddle point issues. We propose **Sparse Spectral Training (SST)**, a memory-efficient **pre-training** framework that *updates all singular values*, *selectively updates singular vectors* via multinomial sampling, and *leverages singular value decomposition (SVD) for initialization and periodic reinitialization*, reducing distortion compared to other low-rank methods. Across tasks including language generation, machine translation, and graph learning, SST outperforms existing memory-efficient training methods and is often comparable to full-rank training. On LLaMA-1.3B, SST reduces the perplexity gap to full-rank training by **97.4%**, demonstrating its effectiveness for scalable, memory-efficient model pre-training. Our code is available at https://anonymous.4open.science/r/sparse_spectral_training-6A2C/.
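
To make the abstract's description concrete, below is a minimal PyTorch sketch of the SST idea as stated there: a low-rank spectral parameterization W ≈ U diag(s) Vᵀ in which all singular values are trained, a multinomially sampled subset of singular vectors receives gradient updates, and the factors are periodically reinitialized via SVD. The class and method names (`SSTLinear`, `resample_vectors`, `reinit_svd`) and the choice of sampling distribution are illustrative assumptions, not the authors' implementation; see the linked code for the actual method.

```python
# Hedged sketch of Sparse Spectral Training, assuming a PyTorch setting.
# Names and the sampling distribution below are assumptions for illustration.
import torch
import torch.nn as nn

class SSTLinear(nn.Module):
    def __init__(self, d_in, d_out, rank):
        super().__init__()
        # Low-rank spectral parameterization W ≈ U diag(s) V^T
        self.U = nn.Parameter(torch.randn(d_out, rank) / rank**0.5)
        self.s = nn.Parameter(torch.ones(rank))   # all singular values are trained
        self.V = nn.Parameter(torch.randn(d_in, rank) / rank**0.5)
        self.register_buffer("active", torch.ones(rank, dtype=torch.bool))

    def resample_vectors(self, k):
        # Multinomial sampling of k singular directions whose U/V columns
        # receive gradient updates this round (assumed proportional to |s|).
        probs = self.s.abs() / self.s.abs().sum()
        idx = torch.multinomial(probs, k, replacement=False)
        self.active.zero_()
        self.active[idx] = True

    @torch.no_grad()
    def reinit_svd(self):
        # Periodic SVD-based reinitialization of the factors from the
        # current effective weight.
        W = (self.U * self.s) @ self.V.t()
        U, S, Vh = torch.linalg.svd(W, full_matrices=False)
        r = self.s.numel()
        self.U.copy_(U[:, :r])
        self.s.copy_(S[:r])
        self.V.copy_(Vh[:r].t())

    def forward(self, x):
        # Only the sampled ("active") singular vectors contribute trainable
        # directions; gradients to inactive columns are blocked via detach.
        U = torch.where(self.active, self.U, self.U.detach())
        V = torch.where(self.active, self.V, self.V.detach())
        return x @ (V * self.s) @ U.t()
```

In this sketch, a training loop would call `resample_vectors(k)` every few steps and `reinit_svd()` at a longer interval; memory savings come from storing and optimizing only the rank-r factors rather than the full weight matrix.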
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Submission Number: 73