6

C. Sun et al.

Source Code

Code
Pre-Processing

Test Suite Synthesis
test 1

…

test n

Parse

test 1

…

test m

Prune

test 1

…

test k

Select by Test Coverage

Filtering Tests
test 1

…

Filtered Invariants

Invariant Synthesis
Prompt

inv 1

…

inv k

test n

Execute

inv 1

…

inv q

Refinement

Fig. 3: Overview of ClassInvGen.

Generation by LLM After building the source program AST, ClassInvGen
uses LLMs to analyze the class module and infers both invariants for the target
class and tests that exercise the class’s implementation as thoroughly as possible.
ClassInvGen uses a fixed system prompt that defines class invariants and outlines two main tasks: (1) generating class invariants from the source code, and
(2) creating a test suite of valid API calls without specifying expected outputs
(Figure 4 and Figure 23 in Appendix). Next, ClassInvGen instantiates a user
prompt template (Figure 24 in Appendix) with the actual target class.
From the source program AST, ClassInvGen identifies program dependencies
and populates the prompt template with the leaf struct/class. Starting from the
leaf nodes, ClassInvGen leverages previously generated invariants by including
them in the prompt for later classes. To accommodate the LLM’s context window limit, only the relevant child classes of the current target class are included
in the prompt, with method implementations and private fields/methods hidden
when necessary. An algorithm for this process is presented in Algorithm 1.
Algorithm 1 presents the invariant generation process for a source program
AST. The main function GenerateInvariant takes a target_class and leverages
a caching mechanism through invariants_dict to avoid redundant computations
(Line 3). The algorithm first collects dependent classes via getClassRecursively
(Line 4) and sorts them using reverseToplogicalSort to ensure dependencyaware processing (Line 5). For each class in the sorted order, it constructs
the necessary context by obtaining the class code through getCodeForClass
(Line 7). For each dependency dep of the current class, it retrieves the dep_code
and concatenates it with class_code (Lines 10–11). The algorithm then generates
invariants using generateInvariantWithLLM and stores them in invariants_dict
(Line 12). The helper function getCodeForClass constructs class representations
by combining the declaration text with any existing invariants from invariants_dict
(Lines 16–18). It optionally includes method bodies based on context window
constraints. This approach ensures efficient invariant generation while maintaining all necessary context and dependencies (Line 20). The algorithm concludes
by returning final_invariants for the target class, effectively managing the invariant generation process while respecting LLM context limitations.

