Keywords: parameter-efficient fine-tuning (PEFT), quantized LLMs, LoRA, hierarchical learning
Abstract: Fine-tuning large language models (LLMs) in resource-constrained environments poses significant challenges due to their size and computational demands. Current methods often rely on aggressive weight quantization to reduce memory and compute costs, but this can cause a noticeable loss of accuracy. This paper introduces H-QLoRA, a novel approach that leverages hierarchical adaptors with low-rank weights to improve performance. By fine-tuning models from the LLaMA and Gemma families, we demonstrate H-QLoRA's efficacy across multiple instruction datasets. H-QLoRA not only surpasses state-of-the-art results for certain model types by recovering high-frequency information lost during 4-bit weight quantization, but also remains efficient in inference cost and memory usage. Whereas traditional methods may trade accuracy for efficiency, H-QLoRA mitigates this trade-off through a hierarchical adaptor structure that captures more nuanced patterns in the data. As a result, H-QLoRA fine-tunes models with the same number of trainable parameters as QLoRA while proving more effective for certain architectures. Overall, H-QLoRA improves fine-tuning outcomes for quantized models in low-resource environments.
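To make the abstract's central idea concrete, below is a minimal illustrative sketch of one possible hierarchical low-rank adaptor attached to a frozen base linear layer. The abstract does not specify H-QLoRA's actual architecture, so the two-level split here (a coarse rank-r1 branch plus a finer rank-r2 branch, with r1 + r2 matching the rank of a plain LoRA baseline so the trainable-parameter budget equals QLoRA's) is purely an assumption for illustration; real 4-bit quantization would use a library such as bitsandbytes, whereas the base weight below is simply frozen.

```python
# Illustrative sketch only: NOT the paper's implementation.
import torch
import torch.nn as nn


class HierarchicalLoRALinear(nn.Module):
    def __init__(self, in_features, out_features, r1=4, r2=4, alpha=16.0):
        super().__init__()
        # Frozen base projection (stands in for the 4-bit quantized weight).
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)

        # Level 1: coarse low-rank correction (assumed structure).
        self.A1 = nn.Parameter(torch.randn(r1, in_features) * 0.01)
        self.B1 = nn.Parameter(torch.zeros(out_features, r1))

        # Level 2: finer low-rank correction on top of level 1 (assumed structure).
        self.A2 = nn.Parameter(torch.randn(r2, in_features) * 0.01)
        self.B2 = nn.Parameter(torch.zeros(out_features, r2))

        self.scale = alpha / float(r1 + r2)

    def forward(self, x):
        # Base output plus the sum of both hierarchical low-rank updates.
        delta = (x @ self.A1.T) @ self.B1.T + (x @ self.A2.T) @ self.B2.T
        return self.base(x) + self.scale * delta


if __name__ == "__main__":
    layer = HierarchicalLoRALinear(64, 64, r1=4, r2=4)
    y = layer(torch.randn(2, 64))
    print(y.shape)  # torch.Size([2, 64])
```

Under this assumed split, the trainable parameters of the two branches total (r1 + r2) * (in_features + out_features), the same count as a single rank-(r1 + r2) LoRA adaptor, which is the sense in which the parameter budget can match QLoRA's.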
Primary Area: foundation or frontier models, including LLMs
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2651