LoFTPat: Low-Rank Subspace Optimization for Parameter-Efficient Fine-Tuning of Genomic Language Models in Pathogenicity Identification

Published: 05 Mar 2025, Last Modified: 25 Apr 2025
Venue: MLGenX 2025
License: CC BY 4.0
Track: Main track (up to 8 pages)
Abstract: Pathogen identification from genomic sequences is vital for disease surveillance, antimicrobial resistance monitoring, and vaccine development. While Large Language Models (LLMs) excel at genomic sequence modeling, existing approaches prioritize accuracy over efficiency, incurring high memory overhead, long training times, and poor scalability. We introduce LoFTPat, a structurally constrained fine-tuning framework that integrates Low-Rank Adaptation (LoRA) within PathoLM’s self-attention layers, enabling efficient task-specific weight modulation. LoFTPat reduces training time by 4.02%, GPU memory usage by 64.3%, and trainable parameters by 99.24%, while surpassing full fine-tuning by +0.44% accuracy, +0.44% F1 score, +0.02% AUC-ROC, and +0.52% balanced accuracy. It adapts efficiently to both short- and long-read sequences, demonstrating strong generalization across bacterial and viral pathogens. By optimizing feature transformations with minimal parameter overhead, LoFTPat offers a scalable, computationally efficient framework for large-scale pathogen classification and genomic analysis.
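The abstract does not include implementation details, but the core mechanism it describes, injecting trainable low-rank factors into frozen self-attention projections, can be sketched as below. This is a minimal illustration, not the paper's code: the `LoRALinear` class, the rank/scaling defaults, and the `q_proj`/`v_proj` attribute names are assumptions standing in for PathoLM's actual modules.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear projection plus a trainable low-rank update:
    y = W x + (alpha / r) * B A x. Only the rank-r factors A and B
    receive gradients, which is how LoRA-style tuning cuts trainable
    parameters by orders of magnitude."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # keep pretrained weights fixed
            p.requires_grad = False
        # A starts small-random, B starts at zero, so the adapter is a
        # no-op at initialization and training begins at the pretrained model.
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # (..., d_in) @ A^T -> (..., r) @ B^T -> (..., d_out)
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

# Hypothetical usage: wrap the query/value projections of each attention
# block in a pretrained backbone (attribute names are placeholders).
def add_lora_to_attention(attn_block, r: int = 8) -> None:
    attn_block.q_proj = LoRALinear(attn_block.q_proj, r=r)
    attn_block.v_proj = LoRALinear(attn_block.v_proj, r=r)
```

With B zero-initialized, only 2·r·d parameters per wrapped projection are trainable, consistent in spirit with the ~99% reduction in trainable parameters the abstract reports.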
Submission Number: 68