['6,9c6,12', '< Over the past few years, large language models have been observed to attain significant advancements in coding capabilities (OpenAI, 2023;Touvron et al., 2023). Meanwhile, models designed specifically for coding tasks have also been introduced (Rozière et al., 2023;Lozhkov et al., 2024;Pinnaparaju et al., 2024). Although LLMs can proficiently generate simple code snippets, they suffer from a decline in performance as code requirements become complicated.', '< Numerous efforts have been made to tackle this complexity. The two-stage methods (Jiang et al., 2023;Zelikman et al., 2023) employ the plan-and-solve strategy, which first generates a draft outline for the complex task and uses it as guidance for implementing the code in the second stage. Multiagent development frameworks (Hong et al., 2024;Qian et al., 2023) mimic real-world software development workflows, assign different roles to LLMs and collaborate to solve a complex goal. Self-improvement (Shinn et al., 2023;Chen et al., 2024), on the other hand, refines the program in accordance with execution feedback from self-generated unit tests.', '< Despite fruitful efforts made by the previous methods in dealing with complex problems, certain challenges still remain unsolved: (1) Two-stage approaches need to design a complete plan at the beginning and lack the ability to adjust the top-level design during implementation, leading to suboptimal decomposition. (2) Multi-agent collaboration frameworks are cumbersome and rely heavily on LLM capabilities, making them difficult to generalize to smaller open-source models. (3) Code refinement through self-tests depends on the correctness of generated unit-tests. Our preliminary study ( §3.1.3) finds that models generate unreliable self-tests in abundance. 
These incorrect tests may mislead self-improvement and, at worst, exacerbate program errors.', '< To address these issues, we propose FUNCODER, a code generation framework utilizing a divide-and-conquer strategy and a novel functional consensus mechanism on functions to decompose complex problems. Starting from the main problem, FUNCODER introduces new functions to cope with certain sub-problems. The new functions will be decomposed recursively, eventually forming a tree of functions. FUNCODER then combines functions bottom-up to achieve increasingly complicated objectives. By dividing-and-conquering tasks into simpler sub-functions, complexity can be gradually reduced. However, errors in sub-functions may propagate to the whole program, thereby damaging overall reliability. We propose functional consensus that samples multiple functions and selects the one demonstrating consensus, measured by the aggregated similarity among candidates. By reaching a consensus, we reduce the discrepancies in code behavior and thus alleviate cascading errors.', '---', '> The landscape of code generation has been dramatically reshaped by the advent of large language models (LLMs) (OpenAI, 2023; Touvron et al., 2023), alongside specialized coding models (Rozière et al., 2023; Lozhkov et al., 2024; Pinnaparaju et al., 2024). While these models excel at generating straightforward code snippets, their performance significantly degrades when confronted with intricate and multifaceted programming requirements. This performance gap highlights a critical challenge: enabling LLMs to reliably produce complex, correct, and robust code.', '> ', '> To address this challenge, various strategies have been explored. Two-stage methods (Jiang et al., 2023; Zelikman et al., 2023) adopt a plan-and-solve paradigm, initially generating a high-level outline before proceeding to detailed code implementation. 
Multi-agent development frameworks (Hong et al., 2024; Qian et al., 2023) simulate real-world software engineering processes by assigning distinct roles to LLMs that collaborate towards a common goal. Furthermore, self-improvement techniques (Shinn et al., 2023; Chen et al., 2024) aim to refine generated programs using execution feedback derived from automatically generated unit tests.', '> ', '> Despite these advancements, persistent limitations hinder the robust generation of complex code. First, two-stage planning approaches often necessitate a complete upfront design, lacking the flexibility to dynamically adapt the top-level plan during the iterative implementation process, which can lead to suboptimal problem decomposition. Second, multi-agent collaboration frameworks are inherently resource-intensive and heavily dependent on the capabilities of powerful LLMs, limiting their generalizability and applicability to smaller, open-source models. Third, and critically, code refinement through self-tests is fundamentally constrained by the accuracy and reliability of the generated unit tests. Our preliminary investigations ( §3.1.3) reveal that LLMs frequently produce an abundance of unreliable self-tests, which can mislead the self-improvement process and, in severe cases, exacerbate existing program errors.', '> ', '> To overcome these critical limitations, we introduce FUNCODER, a novel code generation framework built upon a dynamic divide-and-conquer strategy coupled with an innovative functional consensus mechanism. FUNCODER addresses complex problems by recursively decomposing them into a hierarchical tree of interdependent sub-functions. This top-down decomposition reduces the complexity of individual generation steps, while a bottom-up composition phase aggregates these solved sub-functions to achieve the overall objective. While decomposition simplifies tasks, it also introduces the risk of cascading errors where faults in sub-functions propagate upwards. 
To mitigate this, FUNCODER employs functional consensus, a mechanism that samples multiple implementations for each function and selects the most robust one by evaluating the aggregated behavioral similarity among candidates. This consensus-driven selection significantly reduces inconsistencies in program behavior and prevents the propagation of errors, thereby enhancing the overall reliability and correctness of the generated code.', '22c25', '< -Conquer -  """return factors of x""" raise NotImplementedError() def sum_common_factors(a, b): """sum the common factors""" fa = find_factors(a) fb = find_factors(b) return sum_common(fa, fb)', '---', '> -Conquer -  \\"\\"\\"return factors of x\\"\\"\\" raise NotImplementedError() def sum_common_factors(a, b): \\"\\"\\"sum the common factors\\"\\"\\" fa = find_factors(a) fb = find_factors(b) return sum_common(fa, fb)', '24c27', '< """sum of common elements""" raise NotImplementedError()', '---', '> \\"\\"\\"sum of common elements\\"\\"\\" raise NotImplementedError()', '195c198', '< We can solve for $a$ by multiplying both sides of the congruence by the modular inverse of $27$ modulo $40$ . Since $27$ and $40$ are relatively prime , we know that $27$ has a modular inverse modulo $40$ . We can find this inverse using the Euclidean algorithm : \\ begin { align *} \\ text { gcd }(40 ,27) &= \\ text { gcd }(27 ,13) &= \\ text { gcd }(13 ,1) &= 1. \\ end { align *} Therefore , the modular inverse of $27$ modulo $40$ exists . We can find it using the Euclidean algorithm backwards : \\ begin { align *} 1 &= 13 -3 \\ cdot 4 &= 13 -3(40 -27) &= 4 \\ cdot 27 -3 \\ cdot 40. \\ end { align *} Thus , the modular inverse of $27$ modulo $40$ is $4$ . Multiplying both sides of the congruence $27a \\ equiv 17 \\ pmod {40} $ by $4$ gives $108a \\ equiv 68 \\ pmod {40} $ . Simplifying the left -hand side gives $8a \\ equiv 28 \\ pmod {40} $ . Dividing both sides by $4$ gives $2a \\ equiv 7 \\ pmod {10} $ . 
The smallest positive integer $a$ satisfying this congruence is $a =9 $ , and the second -smallest is $a =19 $ . Therefore , the sum of the smallest and second -smallest positive integers $a$ satisfying the congruence $27a \\ equiv 17 \\ pmod {40} $ is $9 +19=\\ boxed {28} $ .  """ get common element in two list $a$ and $b$ """ ```python def prime_factor ( x : int ) -> list :', '---', '> We can solve for $a$ by multiplying both sides of the congruence by the modular inverse of $27$ modulo $40$ . Since $27$ and $40$ are relatively prime , we know that $27$ has a modular inverse modulo $40$ . We can find this inverse using the Euclidean algorithm : \\ begin { align *} \\ text { gcd }(40 ,27) &= \\ text { gcd }(27 ,13) &= \\ text { gcd }(13 ,1) &= 1. \\ end { align *} Therefore , the modular inverse of $27$ modulo $40$ exists . We can find it using the Euclidean algorithm backwards : \\ begin { align *} 1 &= 13 -3 \\ cdot 4 &= 13 -3(40 -27) &= 4 \\ cdot 27 -3 \\ cdot 40. \\ end { align *} Thus , the modular inverse of $27$ modulo $40$ is $4$ . Multiplying both sides of the congruence $27a \\ equiv 17 \\ pmod {40} $ by $4$ gives $108a \\ equiv 68 \\ pmod {40} $ . Simplifying the left -hand side gives $8a \\ equiv 28 \\ pmod {40} $ . Dividing both sides by $4$ gives $2a \\ equiv 7 \\ pmod {10} $ . The smallest positive integer $a$ satisfying this congruence is $a =9 $ , and the second -smallest is $a =19 $ . Therefore , the sum of the smallest and second -smallest positive integers $a$ satisfying the congruence $27a \\ equiv 17 \\ pmod {40} $ is $9 +19=\\ boxed {28} $ .  \\"\\"\\" get common element in two list $a$ and $b$ \\"\\"\\" ```python def prime_factor ( x : int ) -> list :', '198,201c201,204', '< """ get a list of prime factors of number $x$ """ ret = [] i = 1 while i * i <= x : i += 1 if x % i == 0 and is_prime ( i ) : ret . 
append ( i ) return ret def is_prime ( x : int ) -> bool :', '< """ determine $x$ is a prime number or not """ if x < 2: return False for i in range (2 , int ( x **0.5) + 1) : if x % i == 0: return False return True def get_common ( a : list , b : list ) -> list :', '< """ get common element in two list $a$ and $b$ """ ret = [] for item in a : if item in b : ret . append ( item ) return ret def sum_common_factors ( a : int , b : int ) -> int :', '< """ Return the sum of all common prime factors of $a$ and $b$ """ raise NotImplementedError () ``L et \' s think step by step and implement the following method sum_common_factors `using existing functions to solve : " Return the sum of all common prime factors of $a$ and $b$ " < Assistant >:  ', '---', '> \\"\\"\\" get a list of prime factors of number $x$ \\"\\"\\" ret = [] i = 1 while i * i <= x : i += 1 if x % i == 0 and is_prime ( i ) : ret . append ( i ) return ret def is_prime ( x : int ) -> bool :', '> \\"\\"\\" determine $x$ is a prime number or not \\"\\"\\" if x < 2: return False for i in range (2 , int ( x **0.5) + 1) : if x % i == 0: return False return True def get_common ( a : list , b : list ) -> list :', '> \\"\\"\\" get common element in two list $a$ and $b$ \\"\\"\\" ret = [] for item in a : if item in b : ret . append ( item ) return ret def sum_common_factors ( a : int , b : int ) -> int :', '> \\"\\"\\" Return the sum of all common prime factors of $a$ and $b$ \\"\\"\\" raise NotImplementedError () ``L et \' s think step by step and implement the following method sum_common_factors `using existing functions to solve : " Return the sum of all common prime factors of $a$ and $b$ " < Assistant >:  ', '242,245c245,248', '< """ Return the sum of all common prime factors of $a$ and $b$ """ raise NotImplementedError () ``< Assistant >: def prime_factor ( x : int ) -> list :', '< """ get a list of prime factors of number $x$ """ ret = [] i = 1 while i * i <= x : i += 1 if x % i == 0 and is_prime ( i ) : ret . 
append ( i ) return ret def is_prime ( x : int ) -> bool :', '< """ determine $x$ is a prime number or not """ if x < 2: return False for i in range (2 , int ( x **0.5) + 1) : if x % i == 0: return False return True def get_common ( a : list , b : list ) -> list :', '< """ get common element in two list $a$ and $b$ """  Justification: We conduct experiments to reflect the performance of our methods in code generation ( §3.1) and mathematical reasoning ( §3.2), we also included analysis and ablation study in multiple aspects.', '---', '> \\"\\"\\" Return the sum of all common prime factors of $a$ and $b$ \\"\\"\\" raise NotImplementedError () ``< Assistant >: def prime_factor ( x : int ) -> list :', '> \\"\\"\\" get a list of prime factors of number $x$ \\"\\"\\" ret = [] i = 1 while i * i <= x : i += 1 if x % i == 0 and is_prime ( i ) : ret . append ( i ) return ret def is_prime ( x : int ) -> bool :', '> \\"\\"\\" determine $x$ is a prime number or not \\"\\"\\" if x < 2: return False for i in range (2 , int ( x **0.5) + 1) : if x % i == 0: return False return True def get_common ( a : list , b : list ) -> list :', '> \\"\\"\\" get common element in two list $a$ and $b$ \\"\\"\\"  Justification: We conduct experiments to reflect the performance of our methods in code generation ( §3.1) and mathematical reasoning ( §3.2), we also included analysis and ablation study in multiple aspects.', '371c374', '< Caption: Output 2 ```2 # check if there are cnt1 + cnt2 numbers divisible by both x and y lcm = ( x * y ) // gcd (x , y ) common_numbers = v -( v // lcm ) if common_numbers < cnt1 + cnt2 : return False return True def gcd ( a : int , b : int ) -> int :""" get the greatest common divisor of two numbers $a$ and $b$ """Info: MATH/number_theory/42 (level 5)Problem What is the sum of the smallest and second-smallest positive integers a satisfying the congruence 27a ≡ 17 (mod 40) ?Ground 62', '---', '> Caption: Output 2 ```2 # check if there are cnt1 + cnt2 numbers 
divisible by both x and y lcm = ( x * y ) // gcd (x , y ) common_numbers = v -( v // lcm ) if common_numbers < cnt1 + cnt2 : return False return True def gcd ( a : int , b : int ) -> int :\\"\\"\\" get the greatest common divisor of two numbers $a$ and $b$ \\"\\"\\"Info: MATH/number_theory/42 (level 5)Problem What is the sum of the smallest and second-smallest positive integers a satisfying the congruence 27a ≡ 17 (mod 40) ?Ground 62', '376c379', '< Caption: #Find the smallest and second -smallest positive integers satisfying the congruence equation # using the formula : x + k * ( m // gcd (a , m ) ) gcd , _ , _ = extended_euclid (a , m ) s1 = x s2 = x + ( m // gcd ) return s1 + s2 def find_inverse ( a : int , m : int ) -> int :""" find the inverse of $a$ modulo $m$ """ _ , x , _ = extended_euclid (a , m ) return x % m if x >= 0 else ( x + m ) % m def extended_euclid ( a : int , b : int ) -> Tuple [ int , int , int ]for Divide You are a programming copilot , you can solve a problem by writing Python functions . Your task is to : -For every turn , you need to write a Python function that returns the answer based on Current Code ( not code in chat history ) . -Do not modify function name , arg names , docstring in given functions . -You can import libraries to better solve the problem . -If a single function is too hard to solve , you can decompose it into multiple smaller functions . -You can leave new function unimplemented for now , but write the function at the end of the code and comment what the function does .def sum_common_factors( a : int , b : int ) -> int : """ Compute the sum of all common prime factors of $a$ and $b$ """ raise NotImplementedError () ``L et \' s think step by step and complete the following Python function sum_common_factors `that solves : " Compute the sum of all common prime factors of $a$ and $b$ " < Assistant >: First , I need to get the prime factors of $a$ and $b$ . 
Second , I can use `for `loop to find common element in two factors list . Finally , sum the common factor list and return the answer . Here is the `sum_common_factors `function : ```python def sum_common_factors ( a : int , b : int ) -> int : """ Compute the sum of all common prime factors of $a$ and $b$ """ factors_a = prime_factor ( a ) factors_b = prime_factor ( b ) common_factors = get_common ( factors_a , factors_b ) return sum ( common_factors ) def prime_factor ( x : int ) -> list : """ get a list of prime factors of number $x$ """ raise NotImplementedError () def get_common ( a : list , b : list ) -> list : """ get common element in two list $a$ and $b$ """ raise NotImplementedError def sum_common_factors ( a : int , b : int ) -> int : """ Compute the sum of all common prime factors of $a$ and $b$ """ factors_a = prime_factor ( a ) factors_b = prime_factor ( b ) common_factors = get_common ( factors_a , factors_b ) return sum ( common_factors ) def get_common ( a : list , b : list ) -> list : """ get common element in two list $a$ and $b$ """ raise NotImplementedError () ``L et \' s think step by step and complete the following Python function get_common `that solves : " get common element in two list $a$ and $b$ " < Assistant >: Here is the `get_common `function : ```python def get_common ( a : list , b : list ) -> list :', '---', '> Caption: #Find the smallest and second -smallest positive integers satisfying the congruence equation # using the formula : x + k * ( m // gcd (a , m ) ) gcd , _ , _ = extended_euclid (a , m ) s1 = x s2 = x + ( m // gcd ) return s1 + s2 def find_inverse ( a : int , m : int ) -> int :\\"\\"\\" find the inverse of $a$ modulo $m$ \\"\\"\\" _ , x , _ = extended_euclid (a , m ) return x % m if x >= 0 else ( x + m ) % m def extended_euclid ( a : int , b : int ) -> Tuple [ int , int , int ]for Divide You are a programming copilot , you can solve a problem by writing Python functions . 
Your task is to : -For every turn , you need to write a Python function that returns the answer based on Current Code ( not code in chat history ) . -Do not modify function name , arg names , docstring in given functions . -You can import libraries to better solve the problem . -If a single function is too hard to solve , you can decompose it into multiple smaller functions . -You can leave new function unimplemented for now , but write the function at the end of the code and comment what the function does .def sum_common_factors( a : int , b : int ) -> int : \\"\\"\\" Compute the sum of all common prime factors of $a$ and $b$ \\"\\"\\" raise NotImplementedError () ``L et \' s think step by step and complete the following Python function sum_common_factors `that solves : " Compute the sum of all common prime factors of $a$ and $b$ " < Assistant >: First , I need to get the prime factors of $a$ and $b$ . Second , I can use `for `loop to find common element in two factors list . Finally , sum the common factor list and return the answer . 
Here is the `sum_common_factors `function : ```python def sum_common_factors ( a : int , b : int ) -> int : \\"\\"\\" Compute the sum of all common prime factors of $a$ and $b$ \\"\\"\\" factors_a = prime_factor ( a ) factors_b = prime_factor ( b ) common_factors = get_common ( factors_a , factors_b ) return sum ( common_factors ) def prime_factor ( x : int ) -> list : \\"\\"\\" get a list of prime factors of number $x$ \\"\\"\\" raise NotImplementedError () def get_common ( a : list , b : list ) -> list : \\"\\"\\" get common element in two list $a$ and $b$ \\"\\"\\" raise NotImplementedError def sum_common_factors ( a : int , b : int ) -> int : \\"\\"\\" Compute the sum of all common prime factors of $a$ and $b$ \\"\\"\\" factors_a = prime_factor ( a ) factors_b = prime_factor ( b ) common_factors = get_common ( factors_a , factors_b ) return sum ( common_factors ) def get_common ( a : list , b : list ) -> list : \\"\\"\\" get common element in two list $a$ and $b$ \\"\\"\\" raise NotImplementedError () ``L et \' s think step by step and complete the following Python function get_common `that solves : " get common element in two list $a$ and $b$ " < Assistant >: Here is the `get_common `function : ```python def get_common ( a : list , b : list ) -> list :', '381c384', '< Caption: First, I need to get the prime factors of $a$ and $b$ . Second , I can use `for `loop to find common element in two factors list . Finally , sum the common factor list and return the answer . Here is the `sum_common_factors `function : ```python def sum_common_factors ( a : int , b : int ) -> int :""" Compute the sum of all common prime factors of $a$ and $b$ "', '---', '> Caption: First, I need to get the prime factors of $a$ and $b$ . Second , I can use `for `loop to find common element in two factors list . Finally , sum the common factor list and return the answer . 
Here is the `sum_common_factors `function : ```python def sum_common_factors ( a : int , b : int ) -> int :\\"\\"\\" Compute the sum of all common prime factors of $a$ and $b$ "', '472c475', '< Data: assert lcm (1 , 5) == 5Let \' s think step by step and create some tests for the following assert lcm (1 , 1) == 1function `{ cur_func_name }(...) `in Python . assert lcm (17 , 19) == 17 * 19``<```python User >:{ prev_code }``S Extract tests for the following function `{ cur_func_name }(...) `inPython .tore your function calls for `{ cur_func_name }(...) `as functioncallss , one per line . They will be called later . ```python{ prev_code }C.5 Prompt for Self-Test Generation ``SYou are a proficient software engineer and architect , specialized in tore your test cases for `{ cur_func_name }(...) `as assertions , onetesting , capable of observing and finding security holes and per line . They will be called later .logical issues in code that spans across multiple computer sciencefields and mathematics . You are given a piece of Python function ,and are expected to produce some test cases for that function .Specifically :-You should invoke the function and assert its results in a one -liner fashion .-Here follows a series of mutually uncorrelated functions to test , oneper conversation .< User >:Let \' s think step by step and create some tests for the followingfunction `lcm (...) `in Python .```pythondef lcm ( a : int , b : int ) -> int :""" Find the least common multiple of `a `and `b `. Samples :>>> lcm (3 , 5)15>>> lcm (4 , 6)12"""return round ( a * b / gcd (a , b ) )``Store your test cases for `lcm (...) `as assertions , one per line . Theywill be called later .< Assistant >:```python Sure , I can create some test cases for the `check_valid_bracketsfunction{ prev_code }. We consider the following cases : 1. the two operands are``Lnot co -prime and has common factors ; 2. the two operands areequal ; 3. one of them is 1; 4. two of them is 1; 5. 
both operandset \' s think step by step and implement the following method are primes . Here is an example of these test cases in Python : `{cur_func_name } `using existing functions to solve :"{ cur_func_doc }" ```pythonassert lcm (15 , 25) == 75assert lcm (32 , 32) == 32', '---', '> Data: assert lcm (1 , 5) == 5Let \' s think step by step and create some tests for the following assert lcm (1 , 1) == 1function `{ cur_func_name }(...) `in Python . assert lcm (17 , 19) == 17 * 19``<```python User >:{ prev_code }``S Extract tests for the following function `{ cur_func_name }(...) `inPython .tore your function calls for `{ cur_func_name }(...) `as functioncallss , one per line . They will be called later . ```python{ prev_code }C.5 Prompt for Self-Test Generation ``SYou are a proficient software engineer and architect , specialized in tore your test cases for `{ cur_func_name }(...) `as assertions , onetesting , capable of observing and finding security holes and per line . They will be called later .logical issues in code that spans across multiple computer sciencefields and mathematics . You are given a piece of Python function ,and are expected to produce some test cases for that function .Specifically :-You should invoke the function and assert its results in a one -liner fashion .-Here follows a series of mutually uncorrelated functions to test , oneper conversation .< User >:Let \' s think step by step and create some tests for the followingfunction `lcm (...) `in Python .```pythondef lcm ( a : int , b : int ) -> int :\\"\\"\\" Find the least common multiple of `a `and `b `. Samples :>>> lcm (3 , 5)15>>> lcm (4 , 6)12\\"\\"\\"return round ( a * b / gcd (a , b ) )``Store your test cases for `lcm (...) `as assertions , one per line . Theywill be called later .< Assistant >:```python Sure , I can create some test cases for the `check_valid_bracketsfunction{ prev_code }. We consider the following cases : 1. 
the two operands are``Lnot co -prime and has common factors ; 2. the two operands areequal ; 3. one of them is 1; 4. two of them is 1; 5. both operandset \' s think step by step and implement the following method are primes . Here is an example of these test cases in Python : `{cur_func_name } `using existing functions to solve :"{ cur_func_doc }" ```pythonassert lcm (15 , 25) == 75assert lcm (32 , 32) == 32', '500d502', '< ']
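
The functional consensus step described in the introduction text above (sample several implementations of a function, then keep the one whose behavior agrees most with the others) can be sketched as follows. This is a minimal illustration, not the FUNCODER implementation: the helper names `run_safely` and `functional_consensus`, the three `prime_factor_*` candidates, the probe inputs, and the choice to measure similarity as the number of inputs on which two candidates return equal outputs are all assumptions made for this example.

```python
from typing import Any, Callable, List, Sequence, Tuple


class _Failed:
    """Marks a crashed call; distinct instances never compare equal."""


def run_safely(func: Callable[..., Any], args: tuple) -> Any:
    # A candidate that raises should not "agree" with anything,
    # so each failure gets its own unique marker object.
    try:
        return func(*args)
    except Exception:
        return _Failed()


def functional_consensus(
    candidates: Sequence[Callable[..., Any]],
    inputs: Sequence[tuple],
) -> Tuple[Callable[..., Any], List[int]]:
    """Pick the candidate with the highest aggregated agreement (assumed metric)."""
    outputs = [[run_safely(f, x) for x in inputs] for f in candidates]
    scores = []
    for i, out_i in enumerate(outputs):
        score = 0
        for j, out_j in enumerate(outputs):
            if i != j:
                # Count the inputs on which candidates i and j behave the same.
                score += sum(a == b for a, b in zip(out_i, out_j))
        scores.append(score)
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best], scores


def is_prime(x: int) -> bool:
    return x >= 2 and all(x % q for q in range(2, int(x ** 0.5) + 1))


def prime_factor_a(x: int) -> list:
    # Hypothetical candidate 1: trial division, keeps the leftover factor.
    ret, p = [], 2
    while p * p <= x:
        if x % p == 0:
            ret.append(p)
            while x % p == 0:
                x //= p
        p += 1
    if x > 1:
        ret.append(x)
    return ret


def prime_factor_b(x: int) -> list:
    # Hypothetical candidate 2: checks every divisor up to x.
    return [p for p in range(2, x + 1) if x % p == 0 and is_prime(p)]


def prime_factor_c(x: int) -> list:
    # Hypothetical candidate 3 (buggy): misses prime factors above sqrt(x).
    return [p for p in range(2, int(x ** 0.5) + 1) if x % p == 0 and is_prime(p)]


chosen, scores = functional_consensus(
    [prime_factor_a, prime_factor_b, prime_factor_c],
    [(12,), (7,), (30,), (49,)],
)
print(chosen.__name__, scores)
```

In this toy run the buggy candidate disagrees with the other two on the input `7`, so it receives the lowest score and is not selected. Ties are broken by sample order here, which is also an assumption of the sketch.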
