Characterizing, Detecting, and Correcting Comment Errors in Smart Contract Functions

Yutong Cheng, Haowen Yang, Zhengda Li, Lei Tian

Published: 01 Jan 2024, Last Modified: 07 Oct 2025, SSE 2024, CC BY-SA 4.0

Abstract: NatSpec comments play an essential role in smart contracts. Their clear and informative format helps users gain an accurate understanding of smart contract functions and reduces financial risk. However, widespread non-adherence to NatSpec standards currently causes confusion for both end-users and developers. Current research often neglects the importance of NatSpec formats or solely emphasizes user-centric comments in smart contract generation. This oversight can hinder contract trustworthiness, code reusability, maintenance efficiency, and ultimately the development of the community ecosystem. To bridge this gap, this paper presents the first empirical study on 253 verified contracts from Etherscan, encompassing 16,620 functions. The study finds that 87% of the smart contract functions have Comment Errors (CE) and identifies prevalent deviation patterns. Based on these findings, we propose CETerminator, an automated approach for detecting and rectifying CE in smart contract functions. Because NatSpec-compliant comments are scarce for the collected smart contract functions, CETerminator employs in-context learning on a large language model to generate NatSpec comments. The approach then compares the original and the generated comments, using corpus-driven heuristic rules to identify and correct diverse error categories in the original comments. In our evaluation, CETerminator achieves a high token overlap rate when addressing missing comments. In addition, its average precision, recall, and F1-score for handling inconsistent comments are 85.28%, 86.48%, and 85.85%, respectively, outperforming the baseline by 39.79%, 39.53%, and 39.84%.
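To make the idea of a rule-based comment check concrete, the following is a minimal illustrative sketch in Python. It is not CETerminator's implementation, and the abstract does not specify its heuristic rules. The function names (`parse_param_names`, `documented_params`, `check_comment`) and the rule itself are assumptions made for illustration. The sketch only compares the `@param` tags in a NatSpec comment against the parameter names in a function signature and checks for a `@notice` tag.

```python
import re

# Solidity keywords that can trail a type when a parameter is left unnamed.
NON_NAME_KEYWORDS = {"memory", "calldata", "storage", "payable"}


def parse_param_names(signature: str) -> list[str]:
    """Extract parameter names from a simple Solidity function signature.

    Assumption: the signature has no nested parentheses in its parameter list.
    """
    inside = signature[signature.index("(") + 1 : signature.index(")")]
    names = []
    for part in inside.split(","):
        tokens = part.split()
        # A named parameter has at least a type and a name, e.g. "uint256 amount".
        if len(tokens) >= 2 and tokens[-1] not in NON_NAME_KEYWORDS:
            names.append(tokens[-1])
    return names


def documented_params(comment: str) -> list[str]:
    """Return the parameter names mentioned in NatSpec @param tags."""
    return re.findall(r"@param\s+(\w+)", comment)


def check_comment(signature: str, comment: str) -> dict:
    """Apply a hypothetical consistency rule between a signature and its comment."""
    declared = parse_param_names(signature)
    documented = documented_params(comment)
    return {
        "missing_notice": "@notice" not in comment,
        "undocumented_params": [p for p in declared if p not in documented],
        "stale_params": [p for p in documented if p not in declared],
    }


if __name__ == "__main__":
    sig = "function transfer(address to, uint256 amount) public returns (bool)"
    doc = (
        "/// @notice Moves tokens to another account.\n"
        "/// @param recipient The receiving address.\n"
        "/// @param amount Number of tokens to move."
    )
    print(check_comment(sig, doc))
```

In this example the comment documents `recipient`, a name that does not appear in the signature, and leaves `to` undocumented, so both are flagged. A real detector would need a proper Solidity parser and many more rules; this sketch covers only one narrow kind of inconsistency.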