Abstract: Code search, the task of retrieving relevant code snippets from a codebase given a natural language query, has become increasingly important with the widespread adoption of pre-trained code models in software engineering. However, representation consistency and semantic alignment between the code and natural language query modalities have not been sufficiently investigated in the fine-tuning stage of code search. In this study, we propose CSMM, a fine-tuning method for code search that combines momentum contrastive learning with cross-modal matching. Specifically, we integrate momentum contrastive learning into the fine-tuning process by using a momentum encoder to produce cross-modal momentum representations, and we maintain queue-based dictionaries that supply consistent negative samples. Cross-modal matching uses attention mechanisms to align code and query semantically in a shared embedding space. Experimental results demonstrate that CSMM significantly improves the performance of pre-trained models on code search tasks, outperforming standard fine-tuning methods.
External IDs: dblp:conf/icic/RenJZWW25
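
The abstract names three ingredients of momentum contrastive learning: a momentum encoder, a queue-based dictionary of negatives, and a contrastive objective. The following is a minimal sketch of how these pieces typically fit together, not the CSMM implementation. It assumes a MoCo-style exponential moving average update and an InfoNCE loss, neither of which the abstract specifies. All names (`momentum_update`, `info_nce`, `code_queue`) and default values are hypothetical. Plain Python lists stand in for encoder parameters and embeddings, and the attention-based cross-modal matching step is not shown.

```python
import math
from collections import deque


def momentum_update(online, momentum, m=0.999):
    """EMA update (assumed MoCo-style): momentum <- m * momentum + (1 - m) * online."""
    return [m * k + (1.0 - m) * q for q, k in zip(online, momentum)]


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce(query_vec, positive_key, negative_keys, temperature=0.07):
    """InfoNCE loss for one query: negative log-softmax of the positive logit."""
    q = normalize(query_vec)
    logits = [dot(q, normalize(positive_key)) / temperature]
    logits += [dot(q, normalize(k)) / temperature for k in negative_keys]
    top = max(logits)  # subtract the max for numerical stability
    log_sum = top + math.log(sum(math.exp(l - top) for l in logits))
    return log_sum - logits[0]


# Fixed-size queue of momentum-encoded code vectors used as negatives.
# Once the queue is full, the oldest keys are dropped automatically.
code_queue = deque(maxlen=4)

# Toy step: a query embedding, its matching code key, and earlier keys as negatives.
query = [1.0, 0.0]
positive_code = [0.9, 0.1]
code_queue.extend([[0.0, 1.0], [-1.0, 0.0]])

loss = info_nce(query, positive_code, list(code_queue))
code_queue.append(positive_code)  # the current key becomes a negative for later steps
```

In this sketch the queue lets the negatives outlive a single batch, and the slow EMA update keeps the keys stored in the queue comparable with freshly encoded ones, which is the usual motivation for calling such negatives consistent.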