Keywords: Continual Learning, LoRA, Model Merging, Class Incremental Learning, PEFT
TL;DR: HAM tackles catastrophic forgetting by dynamically grouping similar task adapters, concatenating them within groups after pruning, and merging across groups, showing superior performance on long task sequences
Abstract: Continual Learning allows models to acquire knowledge incrementally, but is challenged by catastrophic forgetting, a phenomenon in which learning new tasks disrupts previously acquired knowledge.
Although large pre-trained models can partially mitigate forgetting by leveraging their existing knowledge and over-parameterization, they often struggle when confronted with novel data distributions.
Parameter-Efficient Fine-Tuning (PEFT) methods, such as LoRA, enable efficient adaptation to new data.
However, they still face challenges in scaling to dynamic learning scenarios and long sequences of tasks, as maintaining one adapter per task introduces complexity and increases the potential for interference.
In this paper, we introduce Hierarchical Adapters Merging (HAM), a novel framework that dynamically combines adapters from different tasks during training.
For each experience, HAM trains a low-rank adapter along with an importance scalar, then dynamically groups tasks based on adapter similarity.
Within each group, adapters are pruned, scaled and merged, facilitating transfer learning between related tasks.
Extensive experiments on three vision benchmarks demonstrate that HAM surpasses state-of-the-art methods, achieving up to a 4% accuracy improvement over the best baseline and nearly doubling efficiency in both training and inference, with particularly strong advantages as the number of tasks increases.
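The within-group step described above (prune each adapter, scale it by its learned importance, and merge) can be pictured with a minimal PyTorch sketch. The function name `merge_group_adapters`, the softmax normalization of the importance scalars, and the magnitude-based pruning rule are illustrative assumptions, not the paper's exact procedure.

```python
import torch

def merge_group_adapters(adapters, importances, prune_ratio=0.8):
    """Hypothetical sketch: merge the LoRA adapters of one task group.

    adapters    : list of (A, B) low-rank factors, A: (r, d_in), B: (d_out, r)
    importances : list of learned importance scalars, one per adapter
    prune_ratio : fraction of smallest-magnitude entries zeroed before merging
    """
    # Normalize importance scalars so the merged update stays on a comparable scale
    # (softmax normalization is an assumption of this sketch).
    weights = torch.softmax(torch.tensor(importances, dtype=torch.float32), dim=0)

    merged = None
    for (A, B), w in zip(adapters, weights):
        delta = B @ A  # full-rank view of this adapter's weight update
        # Magnitude-based pruning: keep only the largest-magnitude entries.
        flat = delta.abs().flatten()
        k = max(1, int(flat.numel() * (1.0 - prune_ratio)))
        threshold = flat.topk(k).values.min()
        delta = torch.where(delta.abs() >= threshold, delta, torch.zeros_like(delta))
        # Scale by normalized importance and accumulate into the group update.
        merged = w * delta if merged is None else merged + w * delta
    return merged

# Example: three rank-4 adapters for a 16x16 weight, with learned importances.
adapters = [(torch.randn(4, 16), torch.randn(16, 4)) for _ in range(3)]
importances = [0.9, 0.4, 0.7]
delta_W = merge_group_adapters(adapters, importances)
```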
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 11575