TreeLoRA: Efficient Continual Learning via Layer-Wise LoRAs Guided by a Hierarchical Gradient-Similarity Tree

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 poster · License: CC BY 4.0
Abstract: Many real-world applications collect data in a streaming environment, where learning tasks are encountered sequentially. This necessitates *continual learning* (CL) to update models online, enabling adaptation to new tasks while preserving past knowledge to prevent catastrophic forgetting. Today, with the rise of *large pre-trained models* (LPMs), *efficiency* has become increasingly critical for CL due to their substantial computational demands and growing parameter sizes. In this paper, we introduce TreeLoRA (K-D Tree of Low-Rank Adapters), a novel approach that constructs *layer-wise* adapters by leveraging hierarchical gradient similarity to enable efficient CL, particularly for LPMs. To reduce the computational burden of task similarity estimation, we employ *bandit* techniques to develop an algorithm based on lower confidence bounds that efficiently explores the task structure. Furthermore, we use sparse gradient updates to facilitate parameter optimization, making the approach better suited for LPMs. Theoretical analysis is provided to justify the rationale behind our approach, and experiments on both *vision transformers* (ViTs) and *large language models* (LLMs) demonstrate the effectiveness and efficiency of our approach across various domains, including vision and natural language processing tasks.
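To make the lower-confidence-bound idea concrete, below is a minimal illustrative sketch (not the authors' implementation; see the linked repository for the real code). It assumes we keep a running mean of the gradient distance between the current task and each past task, plus a count of how often each distance has been sampled, and selects the past task whose distance LCB is smallest, i.e., optimism under uncertainty toward the most similar task. The function name `lcb_select` and the exploration constant `c` are illustrative choices, not from the paper.

```python
import numpy as np

def lcb_select(dist_means, pull_counts, t, c=1.0):
    """Pick the past task whose estimated gradient distance has the
    lowest lower confidence bound (LCB).

    dist_means:  running mean of gradient distance to each past task
    pull_counts: number of times each task's distance was sampled
    t:           current round (drives the exploration bonus)
    c:           exploration constant (illustrative, not from the paper)
    """
    # Standard UCB-style bonus: rarely sampled tasks get a larger bonus,
    # which lowers their LCB and encourages exploring them.
    bonus = c * np.sqrt(np.log(t + 1) / np.maximum(pull_counts, 1))
    lcb = dist_means - bonus
    return int(np.argmin(lcb))
```

With equal sample counts the rule simply picks the task with the smallest mean distance; with equal means it prefers the least-sampled task, since its larger bonus pulls its LCB down.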
Lay Summary: Modern AI systems often need to learn new tasks continuously, but when they learn something new, they tend to forget previous knowledge, a problem called "catastrophic forgetting." We developed TreeLoRA, a method that organizes past learning experiences in a tree-like structure and uses adaptive search techniques to help large AI models efficiently learn new tasks while remembering old ones. Our approach achieves up to 3.2 times faster training than existing methods while maintaining good performance, making it more practical to deploy AI systems that must continually adapt in real-world applications without forgetting.
Link To Code: https://github.com/ZinYY/TreeLoRA
Primary Area: General Machine Learning->Sequential, Network, and Time Series Modeling
Keywords: Continual Learning, Supervised Fine-Tuning, Large Pre-trained Models, Low-Rank Adaptation, Hierarchical Task Structure
Flagged For Ethics Review: true
Submission Number: 3715