Abstract: Prompt tuning alleviates the gap between pre-training and fine-tuning and achieves promising results in various natural language processing (NLP) tasks. However, it is nontrivial to adapt prompt tuning to intelligent code tasks, since code-specific knowledge such as the abstract syntax tree is usually hierarchically structured and therefore hard to convert into plain text. Recent works (e.g., PT4Code) introduce simple task prompts along with a programming language indicator into the prompt template, achieving improvements over non-prompting state-of-the-art code models (e.g., CodeT5). Inspired by this, we propose a novel code-specific prompt paradigm, the meta-data prompt, which introduces semi-structured code meta-data (attribute-value pairs) into the prompt template and facilitates the adaptation of prompt tuning techniques to code tasks. Specifically, we study the usage of diverse meta-data attributes and their combinations, and employ the OpenPrompt framework to implement a meta-data-prompt-based code model, PRIME (PRompt tunIng with MEta-data), with CodeT5 as the backbone model. We evaluate PRIME on the source code summarization task using the publicly available CodeSearchNet benchmark. Results show that 1) using good meta-data leads to an improvement in model performance; 2) the proposed meta-data prompt can be combined with the traditional task prompt for further improvement; 3) our best-performing model consistently outperforms CodeT5 by an absolute score of 0.73 and PT4Code by an absolute score of 0.48 in terms of average BLEU across six programming languages.
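As a rough illustration (not the authors' released implementation), the sketch below shows how a meta-data prompt of attribute-value pairs could be assembled with OpenPrompt on a T5-style backbone for code summarization. The attribute names (`language`, `func_name`), the checkpoint `Salesforce/codet5-base`, and the template wording are assumptions chosen for illustration; the paper's actual attribute set and template may differ.

```python
# Minimal sketch of a meta-data prompt with OpenPrompt (assumed setup,
# not the paper's exact configuration).
from openprompt import PromptDataLoader, PromptForGeneration
from openprompt.data_utils import InputExample
from openprompt.plms import load_plm
from openprompt.prompts import ManualTemplate

# CodeT5 follows the T5 architecture, so we load it through the "t5"
# wrapper; the checkpoint name here is an assumption for illustration.
plm, tokenizer, model_config, WrapperClass = load_plm("t5", "Salesforce/codet5-base")

# Meta-data prompt: attribute-value pairs are verbalized in front of the
# code via {"meta": ...} fields; {"mask"} marks the summary to generate.
template = ManualTemplate(
    tokenizer=tokenizer,
    text='language: {"meta":"language"} function: {"meta":"func_name"} '
         'code: {"placeholder":"text_a"} summarize: {"mask"}',
)

# One toy example; meta holds the semi-structured attribute-value pairs,
# and tgt_text is the reference summary used during training.
examples = [
    InputExample(
        guid=0,
        text_a="def add(a, b):\n    return a + b",
        tgt_text="Return the sum of two numbers.",
        meta={"language": "python", "func_name": "add"},
    )
]

loader = PromptDataLoader(
    dataset=examples,
    template=template,
    tokenizer=tokenizer,
    tokenizer_wrapper_class=WrapperClass,
    max_seq_length=256,
    decoder_max_length=64,
    predict_eos_token=True,  # required for generation-style tasks
)

model = PromptForGeneration(plm=plm, template=template, tokenizer=tokenizer)
model.eval()

for batch in loader:
    # PromptForGeneration.generate returns (output ids, decoded strings).
    _, summaries = model.generate(batch, max_length=64)
    print(summaries)
```

Under this setup, swapping attributes in or out is a one-line change to the template text, which is what makes it convenient to compare different meta-data attributes and their combinations.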