Keywords: Learngene; Parameter Initialization
Abstract: Parameter initialization plays a critical role when building and training diverse models under different scenarios. The recently proposed Learngene framework first learns a compact parameter set, termed the learngene, from a large, well-trained ancestry model (Ans-Net); this set is then inherited and transformed to initialize diverse descendant models (Des-Nets). A central goal of this framework is maximal inheritable efficiency, i.e., learning a parameter set that is dramatically more compact than the Ans-Net. However, existing methods typically fall short in this respect, which limits the portability of the inherited parameters across diverse initialization scenarios. To diagnose this limitation, we revisit a state-of-the-art method, LeTs, and reveal that its transformation matrices, which account for the majority of the inherited parameters, are substantially overparameterized. Motivated by this insight, we introduce ALT (Adaptive Low-rank Transformation) to fundamentally improve inheritable-parameter efficiency. Specifically, we propose a novel SVD-inspired metric, the Gated Importance Score, on which two distinct adaptation strategies, Flat Global Adaptation and Hierarchical Component Adaptation, are built to dynamically refine the transformation matrices while preserving the initialization quality of the Des-Nets. Comprehensive experiments across both vision and language domains demonstrate ALT's state-of-the-art inheritable-parameter efficiency and superior downstream performance.
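To make the overparameterization claim concrete, the sketch below (not the paper's ALT method; all names, the matrix `W`, and the target rank `r` are illustrative assumptions) shows how a dense transformation matrix with low effective rank can be replaced by two much smaller factors obtained from a truncated SVD, shrinking the parameter count while incurring negligible approximation error:

```python
import numpy as np

# Hypothetical "overparameterized" d x d transformation matrix whose
# effective rank is only ~8 (constructed that way for illustration).
rng = np.random.default_rng(0)
d = 256
W = rng.standard_normal((d, 8)) @ rng.standard_normal((8, d))

# Truncated SVD: keep only the top-r singular directions.
U, s, Vt = np.linalg.svd(W, full_matrices=False)
r = 8  # assumed target rank
A = U[:, :r] * s[:r]   # d x r factor (singular values folded in)
B = Vt[:r, :]          # r x d factor

full_params = W.size              # parameters in the dense matrix
lowrank_params = A.size + B.size  # parameters in the factorized form
rel_error = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
print(full_params, lowrank_params, rel_error)
```

Because `W` here is exactly rank 8, the two factors store 4,096 parameters instead of 65,536 while reconstructing `W` up to floating-point error; real transformation matrices are only approximately low-rank, which is why an importance-based adaptive rank choice, as the abstract describes, matters.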
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 15428