Abstract: Protocol specifications, defined in Requests for Comments (RFCs), play a critical role in ensuring the correctness of protocol software systems. Checking consistency between a specification and its implementation requires specification–implementation pairs for testing and verification. However, existing specification-to-code mapping efforts remain largely manual and are typically limited to the file level, so they lack the function-level granularity that effective consistency checking needs. To address this gap, we present Spec2Code, the first LLM-driven framework that automates fine-grained mapping from protocol specifications to function implementations. Given an RFC document and a protocol codebase, Spec2Code first performs preprocessing to extract structured specification requirements (SRs) and function-level code representations, along with contextual and dependency information. To ensure scalability, Spec2Code employs a two-stage process comprising relevance filtering and clustering-based SR organization to reduce the number of candidate pairs. For accuracy, Spec2Code performs fine-grained constraint-level matching on each candidate SR–function pair using LLMs, leveraging enriched context to determine whether a function relates to an SR fully, partially, or not at all. We evaluate Spec2Code on real-world implementations of the HTTP, TLS, and BFD protocols, including Apache Httpd, Nginx, OpenSSL, BoringSSL, FRRouting, and BIRD. Experimental results show that Spec2Code outperforms four state-of-the-art baselines, achieving up to 49%, 66%, and 66% improvement in precision, recall, and F1, respectively. Additionally, using an integrated lightweight consistency verifier, Spec2Code successfully recovers the mappings for 16 known inconsistency bugs and discovers 11 previously unreported inconsistencies, 5 of which have been confirmed by project developers.
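The candidate-reduction and matching steps described in the abstract can be illustrated with a minimal sketch. It is not the Spec2Code implementation: every name below (`SpecRequirement`, `FunctionUnit`, `relevance`, `candidate_pairs`, `match_level`) is hypothetical, the relevance filter is assumed to be a simple token-overlap score, the LLM judgement is replaced by a keyword stub, and the clustering stage is omitted.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SpecRequirement:
    """Hypothetical structured SR: an id, its text, and keyword-style constraints."""
    sr_id: str
    text: str
    constraints: tuple


@dataclass(frozen=True)
class FunctionUnit:
    """Hypothetical function-level code representation."""
    name: str
    body: str


def tokens(text):
    """Lowercase alphanumeric tokens; underscores and punctuation act as separators."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def relevance(sr, fn):
    """Jaccard overlap between SR text and function text (assumed stand-in filter)."""
    a = tokens(sr.text)
    b = tokens(fn.name + " " + fn.body)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def candidate_pairs(srs, fns, threshold=0.1):
    """Keep only SR-function pairs whose relevance reaches the threshold."""
    return [(sr, fn) for sr in srs for fn in fns if relevance(sr, fn) >= threshold]


def match_level(sr, fn):
    """Stub for the LLM judgement: a constraint counts as covered when all of its
    tokens appear in the function. Returns 'full', 'partial', or 'none'."""
    fn_tokens = tokens(fn.name + " " + fn.body)
    covered = sum(1 for c in sr.constraints if tokens(c) <= fn_tokens)
    if covered == 0:
        return "none"
    return "full" if covered == len(sr.constraints) else "partial"
```

In this sketch the filter only prunes obviously unrelated pairs, so the more expensive per-pair judgement (an LLM call in the paper, a stub here) runs on a smaller candidate set.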
External IDs: dblp:conf/kbse/WangQXWC25