Keywords: code intelligence, defect prediction, program analysis, AI for software engineering
Abstract: Line-level defect prediction aims to precisely localize defect-prone code, yet its effectiveness is often limited by insufficient inter-line context modeling and weak coordination across granularities.
To make line-level defect prediction more effective, we propose PHLDP, a $\textbf{P}$DG-to-sequence $\textbf{H}$ierarchical $\textbf{L}$ine-level $\textbf{D}$efect $\textbf{P}$rediction model that jointly learns defect patterns at both the file and line levels.
PHLDP improves effectiveness through (1) PDG-to-sequence conversion to preserve control and data dependencies, (2) hierarchical representation learning for local and global semantics, and (3) dual-level supervision to jointly optimize file-level and line-level predictions.
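The dual-level supervision in point (3) can be illustrated with a minimal sketch: a joint objective that combines a file-level loss with an averaged line-level loss. The function names, the binary cross-entropy choice, and the `alpha` weighting are illustrative assumptions, not PHLDP's exact formulation.

```python
import math

def bce(p, y):
    # Binary cross-entropy for a single prediction p against label y in {0, 1}.
    eps = 1e-9  # numerical floor to avoid log(0)
    return -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))

def dual_level_loss(file_prob, file_label, line_probs, line_labels, alpha=0.5):
    """Hypothetical joint objective for dual-level supervision.

    Combines a file-level defectiveness loss with the mean line-level
    loss; `alpha` trades off the two terms (an assumption for
    illustration, not the paper's reported setting).
    """
    file_term = bce(file_prob, file_label)
    line_term = sum(bce(p, y) for p, y in zip(line_probs, line_labels)) / max(len(line_probs), 1)
    return alpha * file_term + (1 - alpha) * line_term
```

Under such an objective, gradients from the file-level label regularize line-level predictions and vice versa, which is one common way hierarchical models coordinate granularities.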
Experiments on multiple open-source projects show that PHLDP consistently outperforms state-of-the-art baselines in both within-project and cross-project settings, particularly on effort-aware metrics, validating its practical effectiveness for line-level defect prediction.
Our code is available at: https://anonymous.4open.science/r/PHLDP-7181/
Paper Type: Long
Research Area: Code Models
Research Area Keywords: NLP Applications, Language Modeling
Contribution Types: NLP engineering experiment, Publicly available software and/or pre-trained models
Languages Studied: Code language, Java
Submission Number: 4271