Abstract: Neural Architecture Search (NAS) facilitates the automatic design of effective neural networks, but it demands substantial computational resources, particularly for language models. Zero-shot NAS exploits Zero-Cost (ZC) proxies to estimate model performance, thereby markedly reducing computational demands. However, existing ZC proxies rely heavily on in-depth expert knowledge and incur repetitive trial-and-error costs. Moreover, most existing ZC proxies fail to surpass the naive baseline of simply counting parameters. To address these challenges, we introduce a novel framework called \textbf{LPZero} (\textbf{L}anguage model zero-cost \textbf{P}roxy search from \textbf{Zero}), which automates the design of efficient proxies for language models and achieves higher ranking consistency in performance estimation. Specifically, we first consolidate existing ZC proxy designs into a unified framework that serves as the search space, and then apply an evolutionary algorithm to heuristically identify new, promising proxy candidates for language models. To improve the efficiency of the search, we introduce a Predictive-Pruning Strategy (PPS) that preemptively eliminates unpromising proxies, thereby mitigating the risk of proxy degradation. Extensive experiments on the FlexiBERT and GPT-2 search spaces demonstrate the effectiveness of our algorithm. Notably, the ranking consistency achieved by our method significantly surpasses that of existing proxies.
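For readers who want a concrete picture of the search loop the abstract describes, below is a minimal, self-contained Python sketch of the general idea: candidate proxies are compositions of primitive operations over per-architecture statistics, scored by Spearman rank correlation against ground-truth performance, with a cheap probe-subset check standing in for the Predictive-Pruning Strategy. All primitives, operators, thresholds, and helper names here are our own illustrative assumptions, not the paper's actual search space or PPS criterion.

```python
# Illustrative sketch of a ZC-proxy evolutionary search with predictive
# pruning. The primitive/aggregate vocabulary and the pruning rule are
# hypothetical placeholders for the unified search space described in
# the paper.
import math
import random

PRIMITIVES = {
    "identity": lambda xs: xs,
    "abs":      lambda xs: [abs(x) for x in xs],
    "log1p":    lambda xs: [math.log1p(abs(x)) for x in xs],
    "square":   lambda xs: [x * x for x in xs],
}
AGGREGATES = {
    "sum":  sum,
    "mean": lambda xs: sum(xs) / len(xs),
    "max":  max,
}

def random_proxy():
    """Sample a candidate proxy: one unary op followed by one aggregation."""
    return (random.choice(list(PRIMITIVES)), random.choice(list(AGGREGATES)))

def mutate(proxy):
    """Perturb one component of a parent proxy to form a child."""
    op, agg = proxy
    if random.random() < 0.5:
        op = random.choice(list(PRIMITIVES))
    else:
        agg = random.choice(list(AGGREGATES))
    return (op, agg)

def evaluate(proxy, stats):
    """Score one architecture's statistics with a candidate proxy."""
    op, agg = proxy
    return AGGREGATES[agg](PRIMITIVES[op](stats))

def spearman(a, b):
    """Spearman rank correlation of two score lists (ties ignored)."""
    def ranks(v):
        order = sorted(range(len(v)), key=lambda i: v[i])
        r = [0.0] * len(v)
        for rank, i in enumerate(order):
            r[i] = float(rank)
        return r
    ra, rb = ranks(a), ranks(b)
    n = len(a)
    d2 = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

def search(archs, accuracies, generations=20, pop=8, prune_threshold=0.1):
    """Evolutionary loop with predictive pruning on a small probe subset."""
    probe = list(range(min(4, len(archs))))  # cheap early-check subset
    population = [random_proxy() for _ in range(pop)]
    best, best_rho = None, -1.0
    for _ in range(generations):
        survivors = []
        for proxy in population:
            # Predictive pruning: drop proxies that already rank the
            # probe subset poorly, before paying full evaluation cost.
            probe_rho = spearman([evaluate(proxy, archs[i]) for i in probe],
                                 [accuracies[i] for i in probe])
            if probe_rho < prune_threshold:
                continue
            rho = spearman([evaluate(proxy, s) for s in archs], accuracies)
            survivors.append((rho, proxy))
            if rho > best_rho:
                best_rho, best = rho, proxy
        if survivors:
            parents = [p for _, p in survivors]
            population = parents + [mutate(random.choice(parents))
                                    for _ in range(pop - len(parents))]
        else:
            population = [random_proxy() for _ in range(pop)]
    return best, best_rho

if __name__ == "__main__":
    random.seed(0)
    # Synthetic data: each "architecture" is a list of layer statistics,
    # and its accuracy loosely tracks the magnitude of those statistics.
    archs = [[random.gauss(0, 1 + i * 0.2) for _ in range(16)] for i in range(20)]
    accs = [sum(abs(x) for x in a) + random.gauss(0, 1.0) for a in archs]
    proxy, rho = search(archs, accs)
    print("best proxy:", proxy, "Spearman:", round(rho, 3))
```

The probe-subset check mirrors the motivation behind PPS: a proxy that cannot rank even a handful of architectures is discarded before the full evaluation, trading a small risk of false pruning for a large reduction in search cost.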
Paper Type: long
Research Area: Machine Learning for NLP
Contribution Types: Approaches to low-resource settings, Approaches to low-compute settings (efficiency)
Languages Studied: English