DAF: DYNAMIC ADAPTIVE FINE-TUNING OF VISION TRANSFORMERS

18 Sept 2025 (modified: 12 Feb 2026) | ICLR 2026 Conference Desk Rejected Submission | CC BY 4.0
Keywords: Parameter-Efficient Fine-Tuning, Vision Transformers, Dynamic Fine-tuning, Model Adaptation
TL;DR: We challenge the conventional static fine-tuning paradigm by proposing a dynamic adaptive method (DAF) that intelligently reshapes the model's trainable structure during training to adapt to evolving optimization priorities.
Abstract: Parameter-Efficient Fine-Tuning (PEFT) is essential for training large Vision Transformers (ViTs), yet existing methods are constrained by a static allocation paradigm in which the set of trainable parameters is fixed before training begins. We argue that this static approach overlooks how a model's optimization priorities evolve during learning, and that this limits final performance under a constrained parameter budget. Inspired by the sparse, dynamic activation of neurons in the brain, we introduce a dynamic reconfiguration paradigm for PEFT and propose a framework named Dynamic Adaptive Fine-tuning (DAF). The core of DAF is its ability to periodically evaluate, select, and reshape its trainable structure during training. It employs our proposed context-aware decoupled sensitivity analysis method to assess the potential of the backbone network alone while preserving the full learning context. It then executes the proposed Rebuild-and-Refocus update strategy, which preserves learned knowledge by freezing outdated fine-tuning modules and reallocates the entire parameter budget to newly identified critical regions. Extensive experiments on several highly challenging vision benchmarks show that the DAF framework not only significantly outperforms mainstream static PEFT methods but also achieves state-of-the-art (SOTA) performance. Our work challenges the static nature of the PEFT field and opens a new avenue for adapting large pretrained models more intelligently and efficiently. The code is available at https://anonymous.4open.science/r/DAF-9372.
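The evaluate, select, and reshape loop described in the abstract can be read roughly as in the sketch below. This is an illustrative toy and not the authors' implementation: the names `Adapter`, `ToyModel`, `select_top_k`, `rebuild_and_refocus`, `train`, and the `score_fn` callback are all hypothetical, the sensitivity scores are supplied by hand rather than computed by the paper's context-aware decoupled sensitivity analysis, and top-k selection per backbone block is an assumed stand-in for however DAF actually picks critical regions.

```python
from dataclasses import dataclass, field


@dataclass
class Adapter:
    """Hypothetical fine-tuning module attached to one backbone block."""
    block: int
    trainable: bool = True


@dataclass
class ToyModel:
    """Stand-in for a frozen backbone plus its attached fine-tuning modules."""
    num_blocks: int
    adapters: list = field(default_factory=list)

    def trainable_blocks(self):
        return sorted(a.block for a in self.adapters if a.trainable)


def select_top_k(scores, k):
    """Return the indices of the k highest scores (ties go to the lower index)."""
    ranked = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return sorted(ranked[:k])


def rebuild_and_refocus(model, scores, budget):
    """Freeze every existing module, then spend the whole budget on new ones."""
    for adapter in model.adapters:
        adapter.trainable = False  # kept in the model, no longer updated
    for block in select_top_k(scores, budget):
        model.adapters.append(Adapter(block=block))


def train(model, score_fn, steps, period, budget):
    """Reconfigure the trainable structure every `period` steps."""
    for step in range(steps):
        if step % period == 0:
            rebuild_and_refocus(model, score_fn(step), budget)
        # A real optimizer step over the trainable modules would go here.
    return model


# Hand-written scores (assumption): early on blocks 0 and 2 matter most,
# later blocks 1 and 3 do.
def toy_scores(step):
    return [0.9, 0.1, 0.5, 0.2] if step < 2 else [0.1, 0.8, 0.2, 0.7]


model = train(ToyModel(num_blocks=4), toy_scores, steps=4, period=2, budget=2)
```

In this toy run the modules created at step 0 remain in the model but are frozen at step 2, when the full budget of two modules moves to the blocks with the highest new scores. Note that frozen modules still occupy memory, so the budget here limits trainable modules at any one time, not the total number of modules.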
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 11582