Sparse network initialization using deterministic Ramanujan graphs

Published: 18 Jun 2024 · Last Modified: 24 Jul 2024 · TF2M 2024 Poster · CC BY 4.0
Keywords: zero-shot initialization, lottery ticket hypothesis, expander networks, Ramanujan graphs
TL;DR: We introduce a sparsely connected neural network architecture inspired by Ramanujan graphs, which achieves performance comparable to dense networks, with the ultimate goal of sparse LLMs
Abstract: We introduce a sparsely connected neural network architecture inspired by Ramanujan graphs that achieves performance comparable to dense networks. The underlying connectivity graphs are constructed either from Cayley graphs of specific algebraic groups or as Ramanujan $r$-coverings of the full $(k,l)$-biregular bipartite graph on $k + l$ vertices. This method employs zero-shot, data-independent, deterministic pruning at initialization, identifying winning lottery tickets at the outset rather than through the iterative processes that traditional methods rely on. Our ultimate goal is to construct sparse, scalable foundation models. Experimental results demonstrate that the proposed architecture achieves accuracy and sparsity ratios competitive with those obtained by previous pruning-before-training algorithms.
Submission Number: 8
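
To make the pruning-at-initialization idea concrete, here is a minimal PyTorch sketch. It does not implement the paper's Cayley-graph or $r$-covering constructions; instead it uses a simple deterministic, circulant-style biregular bipartite mask as a stand-in, and the names `biregular_mask` and `SparseLinear` are hypothetical.

```python
# Sketch only: a deterministic biregular bipartite mask applied to a linear
# layer at initialization (zero-shot, data-independent pruning). The mask
# construction below is a simple circulant stand-in, not the paper's
# Ramanujan-graph construction.
import torch
import torch.nn as nn

def biregular_mask(n_out: int, n_in: int, degree: int) -> torch.Tensor:
    """Deterministic biregular bipartite mask: output unit i connects to
    inputs (i + j * step) mod n_in for j = 0, ..., degree - 1."""
    step = max(n_in // degree, 1)
    mask = torch.zeros(n_out, n_in)
    for i in range(n_out):
        for j in range(degree):
            mask[i, (i + j * step) % n_in] = 1.0
    return mask

class SparseLinear(nn.Module):
    """Linear layer pruned at initialization by a fixed, data-independent mask."""
    def __init__(self, n_in: int, n_out: int, degree: int):
        super().__init__()
        self.linear = nn.Linear(n_in, n_out)
        self.register_buffer("mask", biregular_mask(n_out, n_in, degree))
        with torch.no_grad():
            self.linear.weight.mul_(self.mask)  # zero-shot pruning at init

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Re-apply the mask so pruned connections stay zero during training.
        return nn.functional.linear(x, self.linear.weight * self.mask,
                                    self.linear.bias)

layer = SparseLinear(n_in=64, n_out=32, degree=4)
print(f"density: {layer.mask.mean().item():.4f}")  # 4/64 = 0.0625
```

Because the mask is fixed before any data is seen, the sparse subnetwork (the "winning ticket") is identified at the outset; training then proceeds only over the surviving connections.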