($\texttt{PASS}$) Visual Prompt Locates Good Structural Sparsity through a Recurrent HyperNetwork

22 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: general machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Channel Pruning, Visual Prompt, Sparse Neural Network
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We propose a novel method, $\texttt{PASS}$, a tailored hyper-network that takes both visual prompts and network weight statistics as input and outputs layer-wise channel sparsity in a recurrent manner.
Abstract: Large-scale neural networks have demonstrated remarkable performance in different domains like vision and language processing, although at the cost of massive computation resources. As illustrated by the compression literature, structured model pruning is a prominent algorithm for improving model efficiency, thanks to its acceleration-friendly sparsity patterns. One of the key questions in structural pruning is how to estimate channel significance. In parallel, work on data-centric AI has shown that prompting-based techniques enable impressive generalization of large language models across diverse downstream tasks. In this paper, we investigate an appealing possibility: *leveraging visual prompts to capture channel importance and derive high-quality structural sparsity*. To this end, we propose a novel algorithmic framework, namely \texttt{PASS}. It is a tailored hyper-network that takes both visual prompts and network weight statistics as input and outputs layer-wise channel sparsity in a recurrent manner. This design accounts for the intrinsic channel dependency between layers. Comprehensive experiments across multiple network architectures and six datasets demonstrate the superiority of $\texttt{PASS}$ in locating good structural sparsity. For example, at the same FLOPs level, $\texttt{PASS}$ subnetworks achieve 1\%$\sim$3\% better accuracy on the Food101 dataset; and at a similar performance level of 80\% accuracy, $\texttt{PASS}$ subnetworks obtain a 0.35$\times$ additional speedup over the baselines. Code is provided in the supplementary material.
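To make the recurrent hyper-network idea concrete, below is a minimal PyTorch sketch of one way such a module could be wired up. All module names, dimensions, and the GRU-based recurrence here are illustrative assumptions for exposition, not the authors' released $\texttt{PASS}$ implementation: it fuses a visual-prompt embedding with per-layer weight statistics and predicts each layer's channel sparsity in sequence, so later layers can condition on earlier decisions.

```python
# Illustrative sketch only (hypothetical names/shapes): a recurrent hyper-network
# that consumes a visual-prompt embedding plus per-layer weight statistics and
# emits a channel keep-ratio for each layer in turn, so the prediction for
# layer i can depend on the decisions made for layers 1..i-1.
import torch
import torch.nn as nn


class RecurrentSparsityHyperNet(nn.Module):
    def __init__(self, prompt_dim=64, stat_dim=8, hidden_dim=128):
        super().__init__()
        # Fuse the visual-prompt embedding with the current layer's weight statistics.
        self.fuse = nn.Linear(prompt_dim + stat_dim, hidden_dim)
        # Recurrent cell carries cross-layer dependency between sparsity decisions.
        self.cell = nn.GRUCell(hidden_dim, hidden_dim)
        # Head maps the hidden state to a keep-ratio in (0, 1) for the current layer.
        self.head = nn.Sequential(nn.Linear(hidden_dim, 1), nn.Sigmoid())

    def forward(self, prompt_emb, layer_stats):
        # prompt_emb:  (prompt_dim,) embedding of the visual prompt
        # layer_stats: list of (stat_dim,) tensors, one per prunable layer
        h = torch.zeros(1, self.cell.hidden_size)
        ratios = []
        for stats in layer_stats:
            x = torch.cat([prompt_emb, stats]).unsqueeze(0)
            h = self.cell(torch.relu(self.fuse(x)), h)
            ratios.append(self.head(h).squeeze())
        return torch.stack(ratios)  # layer-wise channel keep-ratios


# Toy usage: 4 prunable layers, each summarized by an 8-dim statistic vector.
hypernet = RecurrentSparsityHyperNet()
prompt_emb = torch.randn(64)
layer_stats = [torch.randn(8) for _ in range(4)]
print(hypernet(prompt_emb, layer_stats))
```

In this sketch the recurrence is what models the cross-layer channel dependency mentioned in the abstract; how the weight statistics are computed and how the predicted ratios are turned into actual channel masks are left unspecified.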
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 6393