GenePrune: Automated Pruning of Large Language Models for Code Using a Genetic Algorithm

Published: 06 Mar 2025, Last Modified: 06 Mar 2025
Venue: DL4C @ ICLR 2025
License: CC BY 4.0
Track: tiny / short paper (up to 4 pages)
Keywords: Code LLM, Post-Training Optimization, Pruning, Genetic Algorithms
TL;DR: GenePrune is a genetic-algorithm-based pruning framework that optimizes sparsity in Code LLMs while preserving performance, enabling efficient model compression that adapts dynamically, unlike rule-based baselines.
Abstract: Large Language Models (LLMs) for code generation exhibit remarkable capabilities but face deployment challenges due to their high computational and memory demands. Traditional pruning methods, often based on static heuristics like magnitude-based weight pruning, fail to effectively balance sparsity and performance, particularly for structured tasks such as code generation. To address this, we propose GenePrune, a novel genetic algorithm-based pruning framework that optimizes pruning masks for pre-trained Code LLMs without requiring costly retraining. GenePrune iteratively refines pruning configurations through evolutionary operations such as crossover and mutation, guided by a fitness function that balances model sparsity and task-specific performance. Experiments on open-source models like CodeT5 demonstrate that GenePrune achieves superior pruning efficiency, significantly reducing model size while maintaining high BLEU scores for code generation tasks. Our results highlight GenePrune as a promising approach for efficient LLM compression, with potential applications in optimizing inference speed and deployment in resource-constrained environments.
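Illustrative sketch: the abstract describes evolving pruning masks with crossover and mutation under a fitness that balances sparsity and task performance. The snippet below is a minimal sketch of that idea; the population size, single-point crossover, bit-flip mutation, the ALPHA weighting, and the placeholder evaluate_task_score are illustrative assumptions, not GenePrune's actual implementation or hyperparameters.

```python
# Minimal sketch of GA-based search over binary pruning masks (1 = keep, 0 = prune).
# All constants and helpers here are assumptions for illustration only.
import numpy as np

NUM_GROUPS = 512       # number of prunable weight groups (assumed granularity)
POP_SIZE = 20          # candidate masks per generation
GENERATIONS = 50
MUTATION_RATE = 0.01
ALPHA = 0.5            # trade-off between sparsity and task performance

rng = np.random.default_rng(0)

def evaluate_task_score(mask: np.ndarray) -> float:
    """Placeholder: a real run would apply the mask to the model and score it (e.g. BLEU)."""
    return rng.random()  # stand-in value

def fitness(mask: np.ndarray) -> float:
    sparsity = 1.0 - mask.mean()                  # fraction of groups pruned
    return ALPHA * sparsity + (1 - ALPHA) * evaluate_task_score(mask)

# Initialize a population of random binary masks.
population = rng.integers(0, 2, size=(POP_SIZE, NUM_GROUPS))

for gen in range(GENERATIONS):
    scores = np.array([fitness(m) for m in population])
    # Truncation selection: keep the top half as parents.
    parents = population[np.argsort(scores)[::-1][: POP_SIZE // 2]]
    children = []
    while len(children) < POP_SIZE:
        p1, p2 = parents[rng.integers(len(parents), size=2)]
        cut = rng.integers(1, NUM_GROUPS)          # single-point crossover
        child = np.concatenate([p1[:cut], p2[cut:]])
        flip = rng.random(NUM_GROUPS) < MUTATION_RATE
        child = np.where(flip, 1 - child, child)   # bit-flip mutation
        children.append(child)
    population = np.array(children)

best = population[np.argmax([fitness(m) for m in population])]
print("best sparsity:", 1.0 - best.mean())
```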
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Presenter: ~Nikhil_Reddy_Varimalla1
Format: Yes, the presenting author will attend in person if this work is accepted to the workshop.
Funding: Yes, the presenting author of this submission falls under ICLR’s funding aims, and funding would significantly impact their ability to attend the workshop in person.
Submission Number: 58